Internet Telephony
Overview   
Legacy Voice Services                      
VoIP Functions:
    Signaling
    Database Services
    Call Connect & Disconnect
    Codec Operations

VoIP Components:
   Media Gateways
   Gateway Controllers
   IP Network
Voice Protocols & Usages
Signaling System Seven
H.323
Real-Time Transport Protocol
Real-Time Transport Control Protocol
Media Gateway Control Protocol
Session Initiation Protocol
Signaling Transport
Megaco/H.248
Resource Reservation Protocol
VoIP Service Consideration
    Latency
    Jitter
    Bandwidth
    Packet Loss
    Reliability
    Security 
Economic Issues

Internet Telephony Operations
Hardware and Software  Requirements
Directory Services
Constraints and Compatibility Issues
LAN Connectivity Operation
        
VPN (Virtual Private Network)

Internet Telephony

In this web page, we will use the term Internet telephony to refer to the transmission of digitized voice conversations over the Internet by individual PC users. The technology associated with Internet telephony is primarily based on the use of a sound card installed in a PC, a microphone connected to the sound card, and appropriate software. However, we will note later that there are many flavors of Internet telephony, including phone-to-phone calls transported over an IP network as well as PC-to-PC and PC-to-phone calls. These techniques are difficult to classify as either an individual method used by a consumer or as individual techniques cumulatively used by businesses.

Overview

The basic operation of an Internet telephony system commences when a person talks into a microphone. The microphone is in turn connected to a sound card installed in the computer, which accepts an analog waveform and converts it into a digital data stream. Internet telephony software operating on the computer takes the digitized voice data stream, which normally represents a 64-Kbps PCM or a 32-Kbps ADPCM-encoded voice, and compresses the standard encoding data stream into a lower data rate based on the use of a proprietary or standardized voice-compression technique. Once this is accomplished, the software packages the digitized and compressed data stream into packets using a protocol for transmission over the Internet. Most Internet telephony products were originally developed primarily to support modem connections; however, modern products also support LAN-based operations when the LAN is connected to the Internet.
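As a rough check on the rates quoted above, the arithmetic can be sketched in Python. The 8,000 samples-per-second rate and the 8-bit and 4-bit sample sizes are the usual telephony figures for 64-Kbps PCM and 32-Kbps ADPCM; the function name is only illustrative.

```python
def bit_rate_kbps(samples_per_second: int, bits_per_sample: int) -> float:
    """Return the raw bit rate of a sampled voice stream in Kbps."""
    return samples_per_second * bits_per_sample / 1000


# Telephone-quality voice is sampled 8,000 times per second.
pcm_kbps = bit_rate_kbps(8000, 8)    # 8-bit PCM samples
adpcm_kbps = bit_rate_kbps(8000, 4)  # 4-bit ADPCM samples

print(pcm_kbps, adpcm_kbps)
```

Any further compression applied by the telephony software starts from one of these two baseline rates.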

There are two primary transport protocols used for an Internet telephony session. TCP is used to transport addressing or directory information, while UDP is used for the actual transfer of digitized voice packets. Although digitizing voice entered through a microphone is a relatively simple process, differences in the manner in which connections are established over the Internet, in voice-digitization methods, and in the framing of digitized voice samples result in a high degree of incompatibility between vendor products. Before turning our attention to the operation of specific products, let's digress a bit and discuss the economic issues associated with Internet telephony and its basic operation.

Legacy Voice Services

Understanding how the public switched telephone network (PSTN) functions is useful for discussions of VoIP technology. Hence, this section briefly describes how the current PSTN works. There are four major tasks the PSTN must perform to connect a call. Although there are other services besides an end-to-end voice call (for example, conference calls), they are based on the following requirements.

Phone calls are inherently connection-oriented. That is, the connection to the called person must be established ahead of time before the conversation can occur. Switches, the central components in a PSTN, are responsible for creating this connection. Between the circuit switches are connections (trunk links) that carry the voice traffic. These links vary in speed from T1 and E1 to OC-192c/STM-64, with individual channels (DS-0s) in each link type representing one voice channel. Switches are also responsible for converting the analog signal (voice) to a digital format that is transported across the network.

Signaling notifies both the network and its users of important events. Examples of signaling range from the ringer activation letting you know that a call is coming, to the dialing of digits used to make a call. Network elements also use signaling to create connections through a network. 

The Signaling System Seven (SS7) network is a packet-based (connectionless) network that transports the signaling traffic between the switches involved in the call. The service control points (SCPs) are the databases that execute the queries to translate phone numbers into circuit-switching details. They also make possible such features as 800 number support, 911 service, and caller ID. Service switching points (SSPs) are the interfaces between the circuit-switching equipment and the SS7 network. It is here that SS7 messages are translated into the connection details that the switch needs to connect a call.

Generally, the SS7 control network is out of band; that is, it does not share the links used to carry the actual voice channels. Specialized equipment called signal transfer points (STPs) transports the signaling messages. These STPs are analogous to IP routers in that they route signaling messages, which are carried in packets by the Message Transfer Part (MTP) layers of the protocol.

The SS7 network is quite extensive (a large collection of networks) and is deployed throughout most of the developed world. There are many technical and historical reasons why the signaling portion of the network is broken out from the rest of the system. However, the greatest value in such a design is to enable you to add network intelligence and features without a dependency on the underlying circuit-switching infrastructure.

When someone picks up the phone receiver, the public switch is alerted and prepares for the phone number digits to be dialed. This phone switch might be a private branch exchange (PBX) in the same building as the phone or a public switch that is miles away. As the digits are dialed, the originating switch analyzes the digits to see if they are valid and if the destination phone is connected to this same switch. If the call is a local call (not outside the exchange), the switch connects the logical channels of the phones involved and the call is completed.

If the call is not local (for example, an 800 number), the originating switch directs a message to a database. Note the query might not be resolved directly by any particular database and that other provider databases might resolve the requested connection. The initial query results in the intervening switches connecting the logical channels that lead to the destination phone. The destination switch signals the destination phone by activating the ringer. The called party has the option to answer the phone and complete the connection.

When the conversation takes place, the switches at this point must be able to convert the voice (analog signal) into a digital form for transport over the network. Once the call is completed, the switches notify the rest of the network to tear down the connections. There are many more details to this transaction; however, these steps describe the basic flow of events in completing a call (Figure 1). In addition, there are a great many supervisory messages that are passed along the network, such as ringing indication, busy signal, and hang up.

 

VoIP Functions

VoIP components must be able to perform the same functions as the PSTN.

Signaling

Signaling in a VoIP network is just as critical as it is in the legacy phone system. The signaling in a VoIP network activates and coordinates the various components to complete a call. Although the underlying nature of the signaling is the same, there are some technical and architectural differences in a VoIP network. Signaling in a VoIP network is accomplished by the exchange of IP datagram messages between the components. The format of these messages is covered by any number of standard protocols. Regardless of which protocols and product suites are used, these message streams are critical to the function of a voice-enabled network and might need special treatment to guarantee their delivery.

Database Services

Database services are a way to locate an endpoint and translate the addressing that two (usually heterogeneous) networks use. For example, the PSTN uses phone numbers to identify endpoints, while a VoIP network could use an IP address (address abstraction could be accomplished with DNS) and port numbers to identify an endpoint. A call control database contains these mappings and translations. Another important feature is the generation of transaction reports for billing purposes. You can employ additional logic to provide network security, such as to deny a specific endpoint from making overseas calls on the PSTN side. This functionality, coupled with call state control, coordinates the activities of the elements in a VoIP network.
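The translation, screening, and record-keeping roles described above can be illustrated with a minimal Python sketch. The class, its method names, the phone numbers, the addresses, and the use of "011" as an overseas prefix are all hypothetical; a real call control database would be far more elaborate.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Endpoint:
    ip_address: str
    port: int


class CallControlDatabase:
    """Toy translation table from PSTN-style numbers to VoIP endpoints."""

    def __init__(self) -> None:
        self._routes: Dict[str, Endpoint] = {}
        self._blocked_prefixes: Dict[str, List[str]] = {}
        self.call_records: List[Tuple[str, str]] = []  # (caller, dialed) for billing

    def register(self, number: str, endpoint: Endpoint) -> None:
        """Map a phone number to the IP address and port of an endpoint."""
        self._routes[number] = endpoint

    def block_prefix(self, caller: str, prefix: str) -> None:
        """Deny a caller any number starting with the given prefix."""
        self._blocked_prefixes.setdefault(caller, []).append(prefix)

    def resolve(self, caller: str, dialed: str) -> Optional[Endpoint]:
        """Translate a dialed number, or return None if denied or unknown."""
        for prefix in self._blocked_prefixes.get(caller, []):
            if dialed.startswith(prefix):
                return None
        endpoint = self._routes.get(dialed)
        if endpoint is not None:
            self.call_records.append((caller, dialed))
        return endpoint


db = CallControlDatabase()
db.register("5551234", Endpoint("192.0.2.10", 5004))
db.block_prefix("5550000", "011")  # assumed overseas prefix, for illustration

allowed = db.resolve("5559999", "5551234")
denied = db.resolve("5550000", "01144123456")
```

Keeping the mapping, the screening rule, and the call record in one place mirrors the way the text couples these functions with call state control.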

Call Connect and Disconnect (Bearer Control)

The connection of a call is made by two endpoints opening communications sessions between each other. In the PSTN, the public (or private) switch connects logical DS-0 channels through the network to complete the calls. In a VoIP implementation, this connection is a multimedia stream (audio, video, or both) transported in real time. This connection is the bearer channel and represents the voice or video content being delivered. When communication is complete, the IP sessions are released and optionally network resources are freed.

CODEC Operations

Voice communication is analog, while data networking is digital. The process of converting analog waveforms to digital information is done with a coder-decoder (CODEC, which is also known as a voice coder-decoder [VOCODER]). There are many ways an analog voice signal can be transformed, all of which are governed by various standards. The process of conversion is complex and beyond the scope of this paper. Suffice it to say that most of the conversions are based on pulse code modulation (PCM) or its variations. Each encoding scheme has its own history and merit, along with its particular bandwidth needs.

In addition to performing the analog-to-digital conversion, CODECs compress the data stream and provide echo cancellation. Compression of the represented waveform can afford you bandwidth savings. The bandwidth savings for the voice services can come in several forms and work at different levels. For example, analog compression can be part of the encoding scheme (algorithm) and does not need further digital compression from the higher working layers of the media gateway application. Another way to save bandwidth is silence suppression, which is the process of not sending voice packets during the gaps in human conversation. Using compression and/or silence suppression can result in sizable bandwidth savings.

However, some applications could be adversely affected by compression. One example is the impact on modem users. Compression schemes can interfere with the functioning of modems by confusing the constellation encoding used. The result could be modems that never synchronize or modems that exhibit very poor throughput. Some gateways might implement some intelligence that can detect modem usage and disable compression.

Another potential issue deals with low-bit-rate speech compression schemes, such as G.729 and G.723.1. These encoding schemes try to reproduce the subjective sound of the signal rather than the shape of the waveform. As a result, packet loss or severe jitter is more noticeable than with a non-compressed waveform. However, some standards might employ interleaving and other techniques that can minimize the effects of packet loss.

The output from the CODECs is a data stream that is put into IP packets and transported across the network to an endpoint. The endpoints must use the same standards, as well as a common set of CODEC parameters. The result of using different standards or parameters on the two ends is unintelligible communication.

Table 1 lists some of the more important encoding standards covered by the International Telecommunication Union (ITU). As you can see, the price paid for reduced bandwidth consumption is increased conversion delay.

 

Table 1: ITU Encoding Standards

ITU Standard   Description      Bandwidth (Kbps)   Conversion Delay (ms)
G.711          PCM              64                 < 1.00
G.721          ADPCM            32, 16, 24, 40     < 1.00
G.728          LD-CELP          16                 ~ 2.50
G.729          CS-ACELP         8                  ~ 15.00
G.723.1        Multirate CELP   6.3, 5.3           ~ 30.00

VoIP Components

The major components of a VoIP network are very similar in functionality to those of a circuit-switched network. VoIP networks must perform all of the same tasks that the PSTN does, in addition to performing a gateway function to the existing public network. Although they use different technology and a different approach, some of the same component concepts that make up the PSTN also make up VoIP networks. There are three major pieces to a VoIP network.

Media Gateways

Media gateways are responsible for call origination, call detection, analog-to-digital conversion of voice, and creation of voice packets (CODEC functions). In addition, media gateways have optional features, such as voice (analog and/or digital) compression, echo cancellation, silence suppression, and statistics gathering.

The media gateway forms the interface that the voice content uses so that it can be transported over the IP network. Media gateways are the sources of bearer traffic. Typically, each conversation (call) is a single IP session transported by the Real-time Transport Protocol (RTP), which runs over UDP. Media gateways exist in several forms. For example, a media gateway could be a dedicated telecommunication equipment chassis, or even a generic PC running VoIP software. Their features and services can include some or all of the following.

Media Gateway Controllers

Media gateway controllers house the signaling and control services that coordinate the media gateway functions. Media gateway controllers can be considered similar to H.323 gatekeepers. The media gateway controller has the responsibility for some or all of the call signaling coordination, phone number translations, host lookup, resource management, and signaling gateway services to the PSTN (SS7 gateway). The amount of functionality depends on the particular VoIP enabling products used.

In a scalable VoIP network, you can break up the role of a controller into a signaling gateway controller and a media gateway controller. For calls that originate and terminate within the domain of the VoIP network, only a media gateway controller might be needed to complete calls. However, a VoIP network is frequently connected to the public network. You could use a signaling gateway controller to directly connect to the SS7 network, while also interfacing to the VoIP network elements. This signaling controller would be dedicated to the message translation and signaling needed to bridge the PSTN to the VoIP network.

The services of these devices are defined by the protocols and software they are running. There are several protocols and implementations that any number of vendors could deploy. Knowing the details of how the devices use their suite of protocols is important to designing the IP backbone that is to service the VoIP elements.

IP Network

You can view the VoIP network as one logical switch. However, this logical switch is a distributed system rather than a single switch entity; the IP backbone provides the connectivity among the distributed elements. Depending on the VoIP protocols used, this system as a whole is sometimes referred to as a softswitch architecture.

[Figure: pg13.gif]

The IP infrastructure must ensure smooth delivery of the voice and signaling packets to the VoIP elements. Due to their dissimilarities, the IP network must treat voice and data traffic differently. If an IP network is to carry both voice and data traffic, it must be able to prioritize the different traffic types.

There are several correlations between the VoIP and circuit-switching components; however, there are many differences. One is in the transport of the resulting voice traffic. Circuit-switching telecommunications can be best classified as a TDM network that dedicates channels, reserving bandwidth as it is needed out of the trunk links interconnecting the switches. For example, a phone conversation reserves a single DS-0 channel, and that end-to-end connection is used only for the single conversation.

IP networks are quite different from the circuit-switched infrastructure in that they are packet networks based on the idea of statistical availability. Class of service (CoS) ensures that packets of a specific application are given priority. This prioritization is required for real-time VoIP applications to ensure that the voice service is unaffected by other traffic flows.
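One simple way to picture the prioritization described above is a strict-priority queue, sketched here in Python. The two traffic classes and the packet labels are hypothetical, and real routers use more sophisticated schedulers; the sketch only shows voice packets leaving ahead of data packets that arrived earlier.

```python
import heapq
from itertools import count

VOICE, DATA = 0, 1  # lower number = higher priority (hypothetical classes)


class PriorityScheduler:
    """Strict-priority queue: waiting voice packets always leave before data."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, traffic_class: int, packet: str) -> None:
        heapq.heappush(self._heap, (traffic_class, next(self._arrival), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
for traffic_class, packet in [(DATA, "ftp-1"), (VOICE, "rtp-1"),
                              (DATA, "ftp-2"), (VOICE, "rtp-2")]:
    scheduler.enqueue(traffic_class, packet)

order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(order)  # ['rtp-1', 'rtp-2', 'ftp-1', 'ftp-2']
```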

Voice Protocols and Usages

There are a variety of VoIP products and implementations with a wide range of features that are currently deployed. Two major standards bodies govern multimedia delivery (voice being one type) over packet-based networks: ITU and Internet Engineering Task Force (IETF).

Some implementations focus more on the ITU specifications than on the IETF standards. Also, because of the overlap and co-development in many of the standards, there are implementations that draw on both groups of technologies. Still, some vendors implement proprietary schemes that fill apparent gaps in the standards or add functionality that is product dependent. However, not all the standards fall neatly into one group or the other; many of the standards in both bodies address the same problems. The result is some overlap of functionality, as well as differences in approach and nomenclature.

Several VoIP protocols and options exist, as described in the following sections. Not all of the protocols are used in one specific product group. Instead, each product vendor builds its offerings with what is most applicable to its scope, services, and market.

Each of these protocols has its own strengths and weaknesses, with a different approach to service delivery. Each is successful in different products that have a specific market focus. The protocols listed in this section are not an exhaustive set; there are a few other protocol options available.

Signaling System Seven

SS7 (Figure 3) is a widely used suite of telephony protocols expressly designed to establish and terminate phone calls. The SS7 signaling protocol is implemented as a packet-switched network. SS7 networks are intended to be out of band from the voice network itself.

SS7 is both the protocol and the network designed to signal voice services. The importance of the system is that it is a unified interface for the establishment of circuit-switching, translation, and transaction (billing) services.

[Figure 3: ss7.gif]

SS7 is not built on top of other protocols; rather, it is completely its own protocol suite from the physical to the application layers. For networks transporting SS7, it is important that these services are either translated or tunneled through the IP network reliably. Given the importance of SS7 signaling, it is necessary to ensure that these messages are given priority in the network.

VoIP networks might need access to the SS7 facilities to conduct calls that are bridged to the legacy telephony (PSTN) system. How much access is decided by the robustness of the service deployed. Generally, you might need to plan for some amount of SS7 integration to deploy even the most basic of phone services.

 

H.323

The ITU recommendation H.323 is a set of specifications for packet-based multimedia communication systems. These specifications define various signaling functions, as well as media formats related to packetized audio and video services.

H.323 standards were generally the first to classify and solve multimedia delivery issues over LAN technologies. However, as IP networking and the Internet became prevalent, many Internet RFC standard protocols and technologies were developed and based on some of the previous H.323 ideas. Today there is co-operation between the ITU and IETF in solving existing problems, but it is fair to say that the RFC process of furthering the standards has had greater success than the H.323 counterparts.

H.323 networks consist of (media) gateways and gatekeepers. Gateways serve both as H.323 termination endpoints and as interfaces to non-H.323 networks, such as the PSTN. Gatekeepers function as a central unit for call admission control, bandwidth management, and call signaling. A gatekeeper and all its managed gateways form an H.323 zone. Although the gatekeeper is not a required element in H.323, it can help H.323 networks scale to a larger size by separating call control and management functions from the gateways.

H.323 specifications tend to be heavyweight, with an initial focus on LAN networking. These standards have some shortcomings in scalability, especially in large-scale deployments. One of the issues of H.323 scalability is its dependency on TCP-based (connection-oriented) signaling. There is a challenge in maintaining large numbers of TCP sessions because of the greater overhead involved. However, note that most H.323 scalability limitations are based on the prevalent version two of the specification. Subsequent versions of H.323 have a focus on solving some of these problems.

With each call that is initiated, a TCP session (H.225.0 protocol) is created using an encapsulation of a subset of Q.931 messages. This TCP connection is maintained for the duration of the call.

A second session is established using the H.245 protocol. This TCP-based process is for capabilities exchange, master-slave determination, and the establishment and release of media streams. This group of procedures is in addition to the H.225.0 processes.

The H.323 quality of service (QoS) delivery mechanism of choice is the Resource Reservation Protocol (RSVP). This protocol is not considered to have good scaling properties because it focuses on and manages individual application traffic flows.

Although H.323 may not be well suited to service provider spaces, it is well positioned for deployment of enterprise VoIP applications. As a service provider, you might find it necessary to bridge, transport, or interface H.323 services and applications to the PSTN.

[Figure: h323.gif]

Real-time Transport Protocol

RFC 1889 and RFC 1890 cover RTP, which provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. Services include payload type identification, sequence numbering, time stamping, and delivery monitoring.

RTP (Figure 5) provides features for real-time applications, including timing reconstruction, loss detection, security, content delivery, and identification of encoding schemes. The media gateways that digitize voice use RTP to deliver the voice (bearer) traffic. For each participant, a particular pair of destination transport addresses (a network address plus a port pair for RTP and RTCP) defines the session between the two endpoints, which translates into a single RTP session for each phone call in progress.

[Figure 5: rtp.gif]

RTP is an application service built on UDP, so it is connectionless with best-effort delivery. Although RTP is connectionless, it does have a sequencing system that allows for the detection of missing packets. As part of its specification, the RTP Payload Type field includes the encoding scheme that the media gateway uses to digitize the voice content. This field identifies the RTP payload format and determines its interpretation by the CODEC in the media gateway. A profile specifies a default static mapping of payload type codes to payload formats. These mappings represent the ITU G series of encoding schemes.
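The sequencing and Payload Type fields described above sit in the 12-byte fixed RTP header. The following Python sketch packs and parses that header and uses the sequence numbers to spot a missing packet. The helper names are hypothetical, only three static payload types from the audio/video profile are listed, and sequence-number wraparound is deliberately ignored.

```python
import struct

# A few static payload types from the RTP audio/video profile.
STATIC_PAYLOAD_TYPES = {0: "PCMU (G.711 mu-law)", 8: "PCMA (G.711 A-law)", 15: "G728"}

RTP_VERSION = 2


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {"version": byte0 >> 6, "marker": bool(byte1 >> 7),
            "payload_type": byte1 & 0x7F, "sequence": sequence,
            "timestamp": timestamp, "ssrc": ssrc}


def missing_sequences(received):
    """List sequence numbers absent from the received set (no wraparound handling)."""
    seen = set(received)
    return [s for s in range(min(received), max(received) + 1) if s not in seen]


header = pack_rtp_header(payload_type=0, sequence=1000, timestamp=160, ssrc=0x1234ABCD)
fields = parse_rtp_header(header + b"\x00" * 160)  # header plus dummy payload
lost = missing_sequences([1000, 1001, 1003, 1004])
```

A receiver would look up `fields["payload_type"]` to choose the CODEC before decoding the payload that follows the header.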

With the different types of encoding schemes and packet creation rates, RTP packets can vary in size and interval. You must take RTP parameters into account when planning voice services.

All the combined parameters of the RTP sessions dictate how much bandwidth is consumed by the voice bearer traffic. RTP traffic that carries voice traffic is the single greatest contributor to the VoIP network load.
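To make the planning point concrete, here is a minimal Python sketch of the per-call arithmetic. It assumes IPv4 without options (20 bytes), UDP (8 bytes), and the fixed RTP header (12 bytes), and it ignores link-layer overhead and silence suppression. The 20 ms packet interval is an assumption chosen for illustration, not a value given in the text.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options or extensions


def call_bandwidth_kbps(codec_kbps, packet_interval_ms):
    """One-way IP-layer bandwidth for a single RTP voice stream, in Kbps."""
    packets_per_second = 1000 / packet_interval_ms
    payload_bytes = codec_kbps * 1000 / 8 / packets_per_second
    total_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return total_bytes * 8 * packets_per_second / 1000


g711 = call_bandwidth_kbps(codec_kbps=64, packet_interval_ms=20)
g729 = call_bandwidth_kbps(codec_kbps=8, packet_interval_ms=20)
print(g711, g729)
```

Under these assumptions the header overhead is the same 16 Kbps for both encodings, so it dominates the low-bit-rate stream. Lengthening the packet interval reduces that overhead at the cost of added delay.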

 

Real-time Transport Control Protocol

Real-time Transport Control Protocol (RTCP) is the optional companion protocol to RTP; it is not needed for RTP to work. The primary function of RTCP is to provide feedback on the quality of the data distribution being accomplished by RTP. This function is an integral part of RTP's role as a transport protocol and is related to the flow and congestion control functions of the network. Although the feedback reports from RTCP do not tell you where problems are occurring (only that they are), they can be used as a tool to locate problems. With the information generated from different media gateways in the network, RTCP feedback reports enable you to evaluate where network performance might be degrading.

RTCP enables you to monitor the quality of a call session by tracking packet loss, latency (delay), jitter, and other key VoIP concerns. This information is provided on a periodic basis to both ends and is processed per call by the media gateways.
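As an illustration of one such measurement, the interarrival jitter estimate defined in RFC 1889 can be sketched in Python. For each packet, the difference D between the spacing of arrivals and the spacing of RTP timestamps is folded into a running estimate J with a gain of 1/16. The function names and the sample timings are hypothetical, and both values in each pair are assumed to be in the same units.

```python
def update_jitter(jitter, prev_send, prev_recv, send, recv):
    """Apply one step of the running estimate J = J + (|D| - J) / 16."""
    transit_delta = (recv - prev_recv) - (send - prev_send)
    return jitter + (abs(transit_delta) - jitter) / 16


def interarrival_jitter(packets):
    """packets: list of (rtp_timestamp, arrival_time) pairs in the same units."""
    jitter = 0.0
    for (prev_send, prev_recv), (send, recv) in zip(packets, packets[1:]):
        jitter = update_jitter(jitter, prev_send, prev_recv, send, recv)
    return jitter


steady = interarrival_jitter([(0, 5), (20, 25), (40, 45)])    # constant transit time
one_late = interarrival_jitter([(0, 5), (20, 25), (40, 53)])  # third packet 8 units late
```

The 1/16 gain smooths the estimate so that a single late packet nudges the reported jitter rather than dominating it.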

Some gateway devices might not employ RTCP because the facility to report such information is not applicable to the end user. For example, a single residential user (with an analog phone) might not have access to the gateway providing the service. Also, the media gateway vendor can use a more scalable approach to tracking call quality statistics. In this case, the storage, transport, and presentation of statistical information are device dependent.

If you use RTCP (or a vendor-specific implementation) in the network, account for the protocol in your bandwidth calculations. You need to limit the control traffic of RTCP to a small and known fraction of the session bandwidth. It should be small so as not to impair the ability of the transport protocol to carry data. Investigate the amount of bandwidth needed so that you can include the control traffic in the bandwidth specification. The RFC specifications recommend that the fraction of the session bandwidth allocated to RTCP be fixed at five percent.

Media Gateway Control Protocol

The Media Gateway Control Protocol (MGCP, RFC 2705) follows more of the softswitch architecture philosophy. It breaks up the role of traditional voice switches into the components of media gateway, media gateway controller, and signaling gateway functional units. This facilitates the independent managing of each VoIP gateway as a separate entity.

MGCP is a master-slave control protocol that coordinates the actions of media gateways (Figure 6). The media gateway controller in MGCP nomenclature is sometimes referred to as a call agent. The call agent manages the call-related signaling control intelligence, while the media gateway informs the call agent of service events. The call agent instructs the media gateway to create and tear down connections when the calls are generated. In most cases, the call agent informs the media gateways to start an RTP session between two endpoints.

[Figure 6: mgcp.gif]

The signaling performed by the call agent and gateways is in the form of structured messages inside UDP packets. The call agent and media gateways have retransmission facilities for these messages; however, MGCP itself is stateless. Hence, the VoIP components time out and retransmit a message if it is lost. (Compare this mechanism to a TCP delivery mechanism, where the protocol itself attempts to retransmit in the case of packet loss.) Therefore, it is important that you give MGCP messages higher priority than non-real-time traffic so that packet loss does not equate to service interruptions.
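Because UDP gives no delivery guarantee, the sender has to run its own timer. The Python sketch below shows the general shape of such application-level retransmission. The function name, the timeout values, the doubling back-off, and the simulated acknowledgments are all assumptions for illustration and are not taken from the MGCP specification.

```python
def send_with_retransmit(send, wait_for_ack, initial_timeout=0.5, max_attempts=4):
    """Send a message, resending with a doubled timeout until acknowledged.

    send():           transmits the message once (for example, a UDP sendto)
    wait_for_ack(t):  returns True if a response arrives within t seconds
    Returns the number of attempts used, or raises TimeoutError.
    """
    timeout = initial_timeout
    for attempt in range(1, max_attempts + 1):
        send()
        if wait_for_ack(timeout):
            return attempt
        timeout *= 2  # back off before trying again
    raise TimeoutError(f"no response after {max_attempts} attempts")


# Simulated channel: the first two transmissions are "lost", the third is answered.
sent = []
acks = iter([False, False, True])
attempts = send_with_retransmit(lambda: sent.append("CRCX"), lambda t: next(acks))
```

Every lost message costs at least one timeout interval of call setup delay, which is one reason to protect signaling packets from loss in the first place.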

Session Initiation Protocol

The Session Initiation Protocol (SIP, RFC 2543) is part of IETF's multimedia data and control protocol framework. SIP is a powerful client-server signaling protocol used in VoIP networks.

SIP handles the setup and teardown of multimedia sessions between speakers; these sessions can include multimedia conferences, telephone calls, and multimedia distribution. SIP is a text-based signaling protocol transported over either TCP or UDP, and is designed to be lightweight. It inherited some design philosophy and architecture from the Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP) to ensure its simplicity, efficiency, and extensibility. SIP invitations carry Session Description Protocol (SDP) messages to perform capability exchange and to set up call control channel use. These invitations allow participants to agree on a set of compatible media types.
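The text-based, HTTP-like format can be seen in a small Python sketch that assembles an INVITE carrying an SDP body. This is a simplified illustration, not a complete or interoperable message: it omits headers and parameters that a real SIP stack would need, and the addresses, port, Call-ID, and function names are hypothetical.

```python
def build_sdp(origin_user, host_ip, rtp_port, payload_types=(0,)):
    """Describe one audio stream and the RTP payload types the caller offers."""
    formats = " ".join(str(pt) for pt in payload_types)
    lines = [
        "v=0",
        f"o={origin_user} 0 0 IN IP4 {host_ip}",
        "s=call",
        f"c=IN IP4 {host_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP {formats}",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller, callee, host_ip, rtp_port, call_id, payload_types=(0,)):
    """Assemble a simplified INVITE request with the SDP offer as its body."""
    body = build_sdp(caller.split("@")[0], host_ip, rtp_port, payload_types)
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {host_ip}",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


def common_payload_types(offered, supported):
    """Media types both sides can use, in the caller's order of preference."""
    return [pt for pt in offered if pt in supported]


invite = build_invite("alice@example.com", "bob@example.com",
                      "192.0.2.1", 49170, "a84b4c76e66710", payload_types=(0, 8))
agreed = common_payload_types([0, 8, 15], {8, 15})
```

The `m=audio` line lists the RTP payload types on offer, which is how the two participants settle on a compatible encoding.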

SIP supports user mobility by proxying and redirecting requests to the user's current location. Users can inform the server of their current location (IP address or URL) by sending a registration message to a registrar. This function is powerful and often needed for a highly mobile voice user base.

[Figure: sip.gif]

The SIP client-server application has two modes of operation: SIP clients can signal through either a proxy server or a redirect server.

[Figure: sip redirector.gif]

One of the greatest challenges to implementing SIP services is mapping CoS delivery for the signaling and bearer traffic. With its mobile features, SIP implementations tend to be more discrete; SIP clients tend to be larger in number and more geographically distributed. You can identify SIP users, and hence CoS mappings, by having the clients set ToS bits in the IP header.
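A client can mark its own traffic through the standard socket interface, as in the Python sketch below. It places a 3-bit IP precedence value in the top bits of the ToS byte. Choosing precedence 5 for voice is a common convention rather than something stated in the text, the function names are hypothetical, and whether the operating system and the network honor the marking depends on the platform and on router configuration.

```python
import socket


def tos_from_precedence(precedence):
    """Place a 3-bit IP precedence value in the top bits of the ToS byte."""
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence must be between 0 and 7")
    return precedence << 5


def open_marked_udp_socket(precedence):
    """Create a UDP socket whose outgoing packets carry the given precedence."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_from_precedence(precedence))
    return sock


voice_tos = tos_from_precedence(5)  # assumed value for voice traffic
print(hex(voice_tos))  # 0xa0
```

Routers along the path can then classify on the marked bits instead of tracking the addresses of a large and mobile client population.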

Signaling Transport

Signaling Transport (SigTran) is a working draft within the IETF (informational RFC 2719) that addresses the problem of signaling performance and signaling transport (SS7-to-VoIP).

SigTran was defined to be the control protocol between the signaling gateway (for terminating the signaling associated with a given PSTN channel/circuit) and media gateway controllers.

SigTran functionality can also relay SS7 signaling messages through an IP network to PSTN termination on both ends.

SigTran usually manifests itself as a signaling gateway controller. These devices directly bridge the SS7 network to the VoIP network. SigTran is important to ensuring interoperability so as to seamlessly allow heterogeneous networks to function, which is critical when VoIP phone calls having an end-to-end flow terminate in legacy connections (PSTN-VoIP-PSTN).

 

SigTran messages need the highest priority for the VoIP networks to function correctly. The signaling and media gateway controllers are generally non-changing entities; once configured, they do not change locations or addresses. Since the sources and destinations of SigTran messages are rather static, classifying the signaling to CoS mechanisms is relatively straightforward.

Megaco/H.248

Megaco/H.248 is a current draft standard and represents a cooperative proposal from the IETF and ITU standards bodies. Megaco has many similarities to MGCP and borrows the same naming conventions for the VoIP elements. The Megaco architecture defines media gateways that provide media conversion and sources of calls, while media gateway controllers provide call control.

Megaco addresses the same requirements as MGCP, and as a result there is some effort to merge the protocols. It defines a series of transactions coordinated by a media gateway controller for the establishment of call sessions.

The primary focus of Megaco is to promote the standardization of IP telephony equipment. Some of the design goals are as follows.

Resource Reservation Protocol

RSVP (RFC 2205 covers version one) is not specifically a VoIP protocol; rather, it started as a mechanism to enable QoS delivery over router-based networks for multimedia applications.

RSVP was originally created to support reservation of resources (bandwidth or links) for specific applications. Each application signaled the network elements of its intention to use network resources by sending an RSVP request. This request enabled the resource to be used along the path of the traffic flow. The routers would in turn identify the specific application by its address, protocol type, and port numbers. A packet scheduler or some other link-layer-dependent mechanism would be used to determine when particular packets were forwarded. RSVP reservations are half duplex; so that guarantees can be met for full-duplex operation, two requests are needed, one in each direction. Used in this way to support thousands of phone calls, RSVP is not a scalable solution for large-scale VoIP networks. Reserving resources on a per-call basis is an enormous burden on the intervening routers because of the inherent overhead of identifying, classifying, and scheduling IP microflows.

RSVP was extended for use as a signaling protocol for the setup of label switched paths (LSPs) in MPLS domains. Here, RSVP does not reserve bandwidth or a resource on a flow-by-flow basis, but rather enables the use of MPLS. In this capacity, RSVP-signaled LSPs aggregate large traffic flows for VoIP services. This use of RSVP to set up LSPs does not have the same scalability issue as its use in application-level signaling because the signaling of the LSP is a one-time event and does not affect packet scheduling.

VoIP Service Considerations

VoIP traffic has a number of issues that you must carefully consider, such as traffic parameters and network design. Without such due diligence, you could be faced with service that does not function reliably or is severely degraded. These important considerations are as follows.

Latency

Latency (or delay) is the time that it takes a packet to make its way through a network end to end. In telephony terms, latency is the measure of time it takes the talker's voice to reach the listener's ear. Large latency values do not necessarily degrade the sound quality of a phone call, but the result can be a lack of synchronization between the speakers such that there are hesitations in the speakers' interactions.

Generally, it is accepted that the end-to-end latency should be less than 150 ms for toll quality phone calls. To ensure that the latency budget remains below 150 ms, you need to take into account the following primary causes of latency. When designing a multiservice network, the total delay that a signal or packet exhibits is a summation of all the latency contributors. 
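As a minimal sketch of the summation described above, the following Python fragment adds a set of delay contributors and compares the total with the 150 ms budget. The contributor names and millisecond values are illustrative assumptions, not measurements and not figures taken from this page.

```python
# Illustrative latency budget check. The contributor names and values
# below are assumptions chosen for the example, not measured figures.
LATENCY_BUDGET_MS = 150  # toll-quality target cited in the text


def total_latency_ms(contributors):
    """Sum the one-way delay contributors (all in milliseconds)."""
    return sum(contributors.values())


def within_budget(contributors, budget_ms=LATENCY_BUDGET_MS):
    """Return True if the summed delay stays below the budget."""
    return total_latency_ms(contributors) < budget_ms


example_contributors = {
    "codec_and_packetization": 25,
    "serialization": 5,
    "queuing": 20,
    "propagation": 40,
    "playout_buffer": 40,
}

print(total_latency_ms(example_contributors), within_budget(example_contributors))
```

Keeping the contributors in one table like this makes it easy to see which element to reduce when a design exceeds the budget.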

Jitter

Jitter is the measure of time between when a packet is expected to arrive and when it actually arrives. In other words, with a constant packet transmission rate of one packet every 20 ms, every packet would be expected to arrive at the destination exactly 20 ms after the previous one. This situation is not always the case. For example, Figure 9 shows packet one (P1) and packet three (P3) arriving when expected, but packet two (P2) arriving 12 ms later than expected and packet four (P4) arriving 5 ms late.

jitter.gif (179018 bytes)
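The four-packet example can be reproduced with a short Python sketch. It assumes, for illustration only, that the first packet defines time zero and that jitter is taken as actual arrival time minus expected arrival time; the function name is hypothetical.

```python
# Sketch of the jitter example: packets are sent every 20 ms, and jitter is
# taken here as actual arrival time minus expected arrival time.
PACKET_INTERVAL_MS = 20


def arrival_deviation_ms(actual_arrivals_ms, interval_ms=PACKET_INTERVAL_MS):
    """Return how late each packet is relative to its expected slot.

    The first packet defines time zero; packet n is expected n * interval later.
    """
    first = actual_arrivals_ms[0]
    return [
        (arrival - first) - n * interval_ms
        for n, arrival in enumerate(actual_arrivals_ms)
    ]


# P1 and P3 on time, P2 12 ms late, P4 5 ms late (values from the text).
arrivals = [0, 32, 40, 65]
print(arrival_deviation_ms(arrivals))  # [0, 12, 0, 5]
```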

The greatest culprit of jitter is queuing variations caused by dynamic changes in network traffic loads. Another cause is packets that might sometimes take a different equal-cost link that is not physically (or electrically) the same length as the other links.

Media gateways have play-out buffers that buffer a packet stream so that the reconstructed voice waveform is not affected by packet jitter. Play-out buffers can minimize the effects of jitter, but cannot eliminate severe jitter.

Although some amount of jitter is to be expected, severe jitter can cause voice quality issues because the media gateway might discard packets arriving out of order. In this condition, the media gateway could starve its play-out buffer and cause gaps in the reconstructed waveform.
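The interaction between jitter and the play-out buffer can be sketched in Python. This is a simplified model under stated assumptions: a packet that arrives after its scheduled play-out time is treated as missing, and the buffer depth is expressed as a fixed play-out delay. Real media gateways may use adaptive buffers; the function name and values are hypothetical.

```python
def late_packets(arrivals_ms, playout_delay_ms, interval_ms=20):
    """Return the indexes of packets that miss their play-out deadline.

    Packet n is scheduled to play at
    first_arrival + playout_delay + n * interval.
    """
    first = arrivals_ms[0]
    missed = []
    for n, arrival in enumerate(arrivals_ms):
        deadline = first + playout_delay_ms + n * interval_ms
        if arrival > deadline:
            missed.append(n)  # this packet leaves a gap in the waveform
    return missed


# Arrival times from the jitter example (P2 is 12 ms late, P4 is 5 ms late).
arrivals = [0, 32, 40, 65]
print(late_packets(arrivals, playout_delay_ms=10))  # [1]
print(late_packets(arrivals, playout_delay_ms=15))  # []
```

A deeper buffer absorbs more jitter, but every millisecond of play-out delay is added to the end-to-end latency.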

Bandwidth

You can determine how much bandwidth to set aside for voice traffic using simple math. However, in a converged voice and data network, you have to make decisions on how much bandwidth to give each service. These decisions are based on careful consideration of your priorities and the available bandwidth you can afford. If you allocate too little bandwidth for voice service, there might be unacceptable quality issues. Another consideration is that voice services are less tolerant of bandwidth depletion than Internet traffic is. Therefore, bandwidth for voice services and associated signaling must take priority over that of best-effort Internet traffic.

If a network were to use the same prevailing encoding (CODEC) scheme as the current PSTN system, bandwidth requirements for VoIP networks would tend to be larger than those of a circuit-switched voice network of similar capacity. The reason is the overhead in the protocols used to deliver the voice service. Typically, you would need speeds of OC-12c/STM-4 and higher to support thousands of call sessions. However, VoIP networks that employ compression and silence suppression could actually use less bandwidth than a similar circuit-switched network. The reason is the greater granularity in bandwidth usage that a packet-based network has in comparison to a TDM network with a fixed channel size.

Allocations of network bandwidth are based on projected numbers of calls at peak hours. Any over-subscription of voice bandwidth can cause a reduction in voice quality. Also, you must set aside adequate bandwidth for signaling to ensure that calls are complete and to reduce service interruptions.

The formula for calculating total bandwidth needed for voice traffic is relatively straightforward. The formula to calculate RTP bearer voice bandwidth usage for a given number of phone calls is as follows.

bits per sec = packets per sec x packet size in bytes x number of calls x 8 bits per byte

where packets per sec = 1,000 ms / packet creation interval in ms

Example: 2,000 full-duplex G.711 encoded voice channels that have a packet creation interval of 20 ms, with a packet size of 200 bytes (40 bytes of IP/UDP/RTP headers + 160 byte payload)

50 packets per second = 1,000 ms / 20 ms

160 Mbps = 50 x 200 x 2,000 x 8

 

Note that this number is a raw measure of IP traffic and does not take into account the overhead used by the transporting media (links between the routers) and data-link layer protocols. Add this raw IP value to that of the overhead to determine the link speeds needed to support this number of calls. Note also that this value represents only the bearer (voice) content. Signaling bandwidth requirements vary depending on the rate at which the calls are generated and the signaling protocol used. If a large number of calls are initiated in a relatively short period, the peak bandwidth needs for the signaling could be quite high. A general guideline for the maximum bandwidth requirement that an IP signaling protocol needs is roughly three percent of all bearer traffic.

Using the previous example, signaling bandwidth requirements if all 2,000 calls were initiated in one second would be approximately 4.8 Mbps (3 percent of 160 Mbps). With the calculation of bearer and signaling, the total bandwidth needed to support 2,000 G.711 encoded calls would be an approximate maximum of 164.8 Mbps. This bandwidth requirement is a theoretical maximum for this specific case. If the parameters change, such as call initiation rate, voice encoding method, packet creation rate, employment of compression, and silence suppression, the bandwidth requirements would change as well.
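The bearer and signaling arithmetic above can be checked with a few lines of Python. This is a sketch only: the function names are hypothetical, the 3 percent signaling figure is the rough guideline quoted in the text, and link-layer overhead is deliberately left out, as in the worked example.

```python
def bearer_bps(packet_interval_ms, packet_size_bytes, calls):
    """Raw IP bandwidth for the voice (bearer) packets, in bits per second."""
    packets_per_sec = 1000 / packet_interval_ms
    return packets_per_sec * packet_size_bytes * calls * 8


def signaling_bps(bearer, fraction=0.03):
    """Rule-of-thumb peak signaling bandwidth (about 3% of bearer traffic)."""
    return bearer * fraction


# 2,000 G.711 calls, one 200-byte packet every 20 ms (values from the text).
bearer = bearer_bps(packet_interval_ms=20, packet_size_bytes=200, calls=2000)
signaling = signaling_bps(bearer)

print(f"bearer:    {bearer / 1e6:.1f} Mbps")
print(f"signaling: {signaling / 1e6:.1f} Mbps")
print(f"total:     {(bearer + signaling) / 1e6:.1f} Mbps")
```

Changing the packet interval or payload size in the call shows how quickly the header overhead grows as packets get smaller.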

With large VoIP implementations requiring sizable bandwidth, it becomes imperative that the IP network delivers the needed service at predictably high performance.

Packet Loss

Packet loss occurs for many reasons, and in some cases is unavoidable. Often the amount of traffic a network is going to transport is underestimated. During network congestion, routers and switches can overflow their queue buffers and be forced to discard packets. Packet loss for non-real-time applications, such as Web browsers and file transfers, is undesirable but not critical. The protocols used by non-real-time applications, usually TCP, are tolerant of some amount of packet loss because of their retransmission capabilities.

Real-time applications based on UDP are significantly less tolerant of packet loss. UDP does not have retransmission facilities; however, retransmissions would almost never help. In an RTP session, by the time a media gateway could receive a retransmission, the packet would no longer be relevant to the reconstructed voice waveform; that part of the waveform in the retransmitted packet would arrive too late.

It is important that bearer and signaling packets are not discarded; otherwise, voice quality problems or service disruptions might occur. In such instances, CoS mechanisms become very important.

By configuring CoS parameters, you can give packets of greater importance a higher priority in the network, thus ensuring packet delivery for critical applications, even during times of network congestion.

Although packet loss of any kind is undesirable, some loss can be tolerated. Some amount of packet loss for voice services could be acceptable as long as the loss is spread out over a large number of users. As long as the amount of packet loss is less than five percent for the total number of calls, the quality generally is not adversely affected. It is better to drop a packet than to increase the latency of all delivered packets by further buffering them.

Reliability

Although network failures are rare, planning for them is essential. Failover strategies are desirable for cases when network devices malfunction or links are broken. An important strategy is to deploy redundant links between network devices and/or to deploy redundant equipment. To ensure continued service, plan carefully for how media gateways and media gateway controllers can make use of the redundant schemes.

IP networks use routing protocols to exchange routing information. As part of their operation, routing protocols monitor the status of interconnecting links. Routing protocols typically detect and reroute packets around a failure if an alternate path exists. Depending on the interconnecting media used for these links, the time taken to detect and recalculate an alternate path can vary. For example, the loss of signal for a SONET/SDH connection can be detected and subsequently rerouted very quickly. However, a connection through an intervening LAN switch might need to time out the keep-alive protocol before a failure is detected.

Having media gateways and media gateway controllers that can actively detect the status of their next-hop address (default gateway) as part of their failover mechanism decreases the likelihood of a large service disruption. Another possible option is to connect the media gateway and media gateway controller directly to the router. In this case, a link failure (depending on the nature of the failure) could be detected immediately, and the network devices would take appropriate action. Still another option for reducing long-term failure could be to employ a redundancy mechanism such as the Virtual Router Redundancy Protocol (VRRP).

Security

Security, especially in a converged voice and data network, is a high priority. You need to protect the voice communications devices from unauthorized access and malicious attack.

While you can thwart unauthorized access by using security protocols (such as RADIUS and SSH), denial-of-service (DoS) attacks can be a real danger to voice services. It is conceivable that such attacks would either cripple or completely disable voice services.

One method of ensuring that DoS attacks are not successful is to use private addressing for the VoIP devices. Private addressing (RFC 1918) can keep Internet-based attacks from happening because private addresses are not routable (not advertised) in the public Internet. (This method would actually only work for most of the Internet and not directly connected autonomous systems because of the possibility of default routing.) You can use private addressing only if the in-band VoIP service never crosses autonomous-system (AS) boundaries. Interface outside the network would be through the ties to the PSTN.
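When auditing a VoIP addressing plan, it can help to confirm that every device address falls inside one of the three RFC 1918 private blocks. The sketch below uses Python's standard `ipaddress` module; the helper name `is_rfc1918` and the sample addresses are illustrative assumptions.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address lies in an RFC 1918 private block."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


# Hypothetical device addresses used only to exercise the check.
for device in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
    print(device, is_rfc1918(device))
```

The explicit network list is used instead of the module's `is_private` attribute because that attribute also covers ranges outside RFC 1918, such as loopback and link-local addresses.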

If any part of the VoIP service needs access to the Internet, you can configure packet filtering to provide protection. Through packet filtering, you can selectively allow the VoIP devices to communicate with each other while denying traffic from possible attackers. It is important that whatever packet filtering is employed does not impact the network performance. Any gains made in protecting the equipment from attack could be lost with routers or switches that cannot perform filtering without compromising performance.

Security of the actual IP network is also an important consideration. A malicious attack on an IP network, specifically the routers that carry the traffic, could compromise the network services. Someone using a router or similarly capable device connected to the network could spoof the routing protocols and cause disruption.

Economic Issues

Any discussion of economic issues associated with Internet access must consider the method of access used. Thus, prior to discussing the economic issues associated with Internet access, let's examine the two basic methods used to obtain such access.

Basic Access Methods
There are two basic methods associated with Internet access: dial-up and direct connection. Dial-up access is based on transmission using the Serial Line Internet Protocol (SLIP) or the Point-to-Point Protocol (PPP) to an Internet Service Provider's (ISP's) network access device.

SLIP vs. PPP
A key difference between SLIP and PPP is that the Serial Line Internet Protocol requires the user to know the IP address assigned to their computer by their Internet Service Provider as well as the IP address of the remote system the computer will dial into. If the ISP assigns IP addresses dynamically, the SLIP software operating on the local computer must be able to adjust to automatic IP address assignments. In addition, the local computer operator may have to configure such TCP/IP parameters as the MTU (maximum transmission unit); the MRU (maximum receive unit); the use of Van Jacobson (VJ) compression, which results in SLIP functioning as Compressed SLIP (CSLIP); and other features. The recognition that configuring such parameters was beyond the knowledge of the growing base of non-technical dial-up Internet users provided the rationale for PPP, which is a newer protocol.

PPP has several benefits over SLIP. Those benefits include negotiating configuration parameters at the start of a connection and the support of two security methods for login to a remote system. Because PPP negotiates configuration parameters at the beginning of a session its use considerably simplifies the configuration of a PPP connection. Concerning security, PPP supports the Password Authentication Protocol (PAP) and the Challenge-Handshake Authentication Protocol (CHAP). The selection of either method permits the local computer to automatically transmit a previously entered user-ID and associated password to the remote system. Thus, a majority of serial point-to-point Internet communications occurs via the use of PPP today. We can obtain an appreciation for its capability by briefly examining the structure of the PPP frame.

PPP Frame Structure
The PPP frame represents a frame structure first standardized by the International Organization for Standardization (ISO) for High Level Data Link Control (HDLC). In fact, PPP uses the HDLC protocol as a basis for encapsulating datagrams over point-to-point circuits.

In examining the frame below, any familiarity with HDLC will allow you to quickly see that a PPP frame is a streamlined version of HDLC. The Flag field represents the bit sequence 01111110 and indicates the beginning or end of a frame. Similar to HDLC, PPP will insert a binary 0 when a sequence of five 1s occurs naturally in the data to prevent its misinterpretation as a Flag field, a process referred to as zero insertion. PPP will also remove the inserted 0 at the receiver, performing insertion and removal transparently to the user.
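The zero-insertion process can be sketched in Python as a pair of functions operating on strings of "0" and "1" characters. This is an illustrative model, not production framing code: the function names are hypothetical, and the receiver side assumes it is given correctly stuffed data with the Flag fields already stripped.

```python
def zero_insert(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out = []
    run = 0
    for bit in bits:
        out.append(bit)
        if bit == "1":
            run += 1
            if run == 5:
                out.append("0")  # stuffed bit breaks up the run of 1s
                run = 0
        else:
            run = 0
    return "".join(out)


def zero_remove(bits):
    """Remove the 0 that follows every run of five consecutive 1s."""
    out = []
    run = 0
    skip_next = False
    for bit in bits:
        if skip_next:
            skip_next = False  # drop the stuffed 0
            run = 0
            continue
        out.append(bit)
        if bit == "1":
            run += 1
            if run == 5:
                skip_next = True
        else:
            run = 0
    return "".join(out)


data = "01111110"  # payload that happens to look like a Flag field
stuffed = zero_insert(data)
print(stuffed)                       # 011111010
print(zero_remove(stuffed) == data)  # True
```

Because the stuffed stream never contains six consecutive 1s, the receiver can treat any 01111110 pattern it sees as a genuine Flag field.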

Field                    Flag    Address    Control    Protocol    Data        FCS
Field Length in Bytes    1       1          1          2           Variable    2 or 4

Unlike an HDLC frame that will transport different addresses in its Address field, PPP always uses the fixed binary sequence 11111111, which represents a standard broadcast address. Because PPP is used on point-to-point circuits, it does not assign individual station addresses. In comparison, HDLC can be used to address multiple devices and normally conveys a specific device address.

The PPP Control field contains the fixed binary sequence 00000011. This value denotes the transmission of user data in an unsequenced frame. Because PPP is restricted to point-to-point circuits, it does not need to transmit sequenced frames or supervisory frames, with the latter two types of frames supported by HDLC.

The two-byte Protocol field gives PPP the capability to encapsulate other protocols besides IP. This field is followed by a variable-length Data field. The default maximum length of this field is 1500 bytes, which is also the maximum length of the Information field in an Ethernet frame. Finally, the Frame Check Sequence (FCS) field terminates the frame. This field is normally 2 bytes; however, PPP supports a 32-bit (4-byte) FCS for improved error detection.
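The field layout described above can be assembled in Python as a rough sketch. Several things here are assumptions for illustration: the function names are hypothetical, 0x0021 is used as the Protocol value for IP, only the opening Flag is emitted, and no zero insertion or escaping is applied. The FCS routine is the common 16-bit CRC used with HDLC-like framing (polynomial 0x8408 in reflected form); check it against the PPP specification before relying on it.

```python
import struct

FLAG = 0x7E            # 01111110
ADDRESS = 0xFF         # 11111111, fixed broadcast address
CONTROL = 0x03         # 00000011, unsequenced frame
PROTOCOL_IP = 0x0021   # assumed Protocol value for IP datagrams
MAX_DATA_BYTES = 1500  # default maximum length of the Data field


def fcs16(data):
    """Compute a 16-bit frame check sequence over the given bytes."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            if fcs & 1:
                fcs = (fcs >> 1) ^ 0x8408
            else:
                fcs >>= 1
    return fcs ^ 0xFFFF


def build_ppp_frame(payload, protocol=PROTOCOL_IP):
    """Return Flag + Address + Control + Protocol + Data + 2-byte FCS."""
    if len(payload) > MAX_DATA_BYTES:
        raise ValueError("payload exceeds the default 1500-byte Data field")
    body = struct.pack("!BBH", ADDRESS, CONTROL, protocol) + payload
    # The FCS covers Address through Data and is sent low byte first.
    return bytes([FLAG]) + body + struct.pack("<H", fcs16(body))


frame = build_ppp_frame(b"hello")
print(frame.hex())
```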

Regardless of the protocol used, both SLIP and PPP operating devices will dial an ISP access device. The ISP's network access device physically consists of a series of rack-mounted modems connected to a communications server. The server represents one of several devices connected to a local area network, with a router connected to the LAN, while the router's serial port is used to provide a high-speed communications connection from the ISP to an Internet network service provider (NSP). The NSP typically operates a high-speed backbone connection that provides interconnectivity between ISPs.

Dedicated access is normally associated with the connection of a group of subscribers located within a building or university campus. Under this access method, subscribers are connected to a corporate LAN, and the local area network is in turn connected via the use of a router and leased line to an ISP. Instead of the line terminating at an ISP's communications server, the leased line terminates at a multi-port router connected to a LAN. Each subscriber PC commonly operates a browser on top of a TCP/IP protocol stack, either purchased from a third-party provider or obtained from operating systems that include built-in stacks, such as Windows 95, Windows 98, or Windows NT/Windows 2000.

eccnomic-issues.gif (195291 bytes)

The picture above illustrates the two primary methods used for accessing the Internet.
Now we will turn our attention to the pricing structure of each method.

Pricing Structure

On an individual dial-up connection, many ISPs offer a flat-fee pricing structure, typically $19.95 to $21.95 per month for unlimited use. When this type of pricing structure is used, there is essentially no additional cost associated with transmitting voice over the Internet, other than the one-time cost for hardware and software and fees charged by certain vendors that now offer a voice gateway service. The voice gateway service enables calls routed via the Internet to be dialed to their ultimate destination via the public switched telephone network, as illustrated in the picture below. In examining the picture, note that the voice gateway is programmed to accept calls from predefined accounts or from a pay-as-you-use account based on the use of "digital cash" or a credit or debit card. Once access is authorized, the voice gateway will out-dial the desired telephone number, billing the user for a local call and surcharge or for a long-distance call that is more economical because the gateway operator can purchase a block of minutes at a lower rate than individuals can obtain.
eccnomic-issues2.gif (162409 bytes)                                                                                                                                   

The use of a voice gateway is most effective for conducting international long-distance calls. For example, a voice gateway provider might bill calls received via the Internet for a gateway service in London at $.10 per minute. If you were calling via an ISP connection in New York, you could avoid a long-distance international call between New York and London that could cost between $.25 and $1.10 or more per minute, depending on the time of day the call is made. Thus, you might be able to save between $.15 and $1.00 per minute for this type of call.

One new "wrinkle" concerning ISPs that deserves mention is the growth in advertiser-supported free Internet access. During 2000, several popular, totally free Internet access providers accumulated approximately 10 million subscribers. In fact, both this author and his wife were Bluelight fanatics, using the joint Kmart-Yahoo! advertiser-sponsored free Internet access service on a daily basis.

Internet Telephony Operations

The actual PC configuration required for Internet telephony operations depends on the software program you are using. The software program operates under a specific operating system (such as a version of Microsoft Windows or the Apple Computer Macintosh System 7.x), interfaces with a sound card to compress voice input entered through a microphone connected to the card, and coordinates a TCP/IP SLIP or PPP communications connection via the use of a modem. Originally, most sound cards were half-duplex, but today most manufactured sound cards can operate in a full-duplex mode. Thus, obtaining a full-duplex telephone connection requires the use of a full-duplex sound card as well as software that supports full-duplex operations.

Hardware and Software Requirements

The table below  lists the general categories of hardware and software required for Internet telephony.

Processor and RAM
Concerning specific hardware, at a minimum, most programs require a high-performance 486 processor or equivalent because voice compression is processor-intensive. The actual processor and RAM required are commonly noted in the program specification sheet for a product.

Software
   Telephony program
   TCP/IP stack
   Operating system (Macintosh, Windows)

Hardware
   Sound card (half- or full-duplex)
   Speaker
   Microprocessor
   Microphone
   Computer platform
   RAM

General Categories of Hardware and Software Required for Internet Telephony

Modem
Since the transmission of digitized voice involves some overhead resulting from the framing of voice packets, including header and trailer fields, most Internet telephony programs require the use of a modem that operates at a minimum data transfer rate of 14.4 Kbps. To put this operating rate in perspective, it is equivalent to 1800 bytes per second, while voice encoding using PCM requires 8000 bytes per second of bandwidth, which is reduced to 4000 bytes per second when ADPCM coding is used. Clearly, then, Internet telephony depends on the use of a vocoding or hybrid coding technique to enable 14.4-Kbps modems to support a digitized voice transmission capability. In fact, the method of voice digitization can differ between vendor products, and this is one of several issues that currently result in a high degree of non-interoperability between different vendor products.
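The byte-rate comparison above is easy to verify, and it also shows roughly how much compression a 14.4-Kbps modem demands. The Python sketch below uses hypothetical helper names and ignores packet framing overhead, so the real compression requirement is somewhat higher than the ratios it prints.

```python
def bytes_per_second(bits_per_second):
    """Convert a line or codec rate from bits per second to bytes per second."""
    return bits_per_second / 8


def min_compression_ratio(source_bps, channel_bps):
    """Smallest compression ratio that fits the source into the channel."""
    return source_bps / channel_bps


MODEM_BPS = 14_400  # minimum modem rate cited in the text
PCM_BPS = 64_000    # 64-Kbps PCM
ADPCM_BPS = 32_000  # 32-Kbps ADPCM

print(bytes_per_second(MODEM_BPS))  # 1800.0
print(bytes_per_second(PCM_BPS))    # 8000.0
print(bytes_per_second(ADPCM_BPS))  # 4000.0
print(round(min_compression_ratio(PCM_BPS, MODEM_BPS), 2))
print(round(min_compression_ratio(ADPCM_BPS, MODEM_BPS), 2))
```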

While most Internet telephony products can operate at a data transfer rate of 14.4 Kbps, the quality of reconstructed voice considerably improves as the modem operating rate increases. If you use a V.90 or the newer V.92 modem you will obtain the ability to download data at approximately 42 Kbps for each modem while uploading can be accomplished at 33.6 Kbps for a V.90 and 40.0 Kbps when a V.92 modem is used. Because both the V.90 and V.92 modems provide more than double the operating rate of a symmetrical 14.4 Kbps modem, their use will significantly reduce latency on the access line. This may be sufficient to convert a marginal voice-over-IP call into one that provides a high quality of audio reconstruction.

Sound Card
Through the use of a full-duplex sound card and software support, you can enable both parties to talk at the same time. In actuality, a full-duplex communications capability is slightly better than a half-duplex communications capability because you are able to gracefully back out of a conversation instead of having to awkwardly wait to ascertain if a person is done speaking as in half-duplex communications. Some programs enable the use of two sound cards, one for playback and one for recording the conversation. If you have an extra sound card and your program supports the use of two cards, you can avoid the purchase of a full-duplex card.

Although you might be tempted to avoid the use of a full-duplex sound card because of possible bandwidth problems, you should note that the actual bandwidth required may increase by only a few percent over the use of a half-duplex sound card. This is because many Internet telephony programs that support full-duplex operations also use silence suppression, transmitting data only when a person actually speaks.

Directory Services

Originally, many Internet telephony software products were based on the use of a "directory service" server. To initiate a call, you use the program operating on your computer to access the software vendor's directory and double-click on an entry. The other party must be logged on to an Internet Service Provider and running the same software on his or her computer for the call to be received. Other Internet telephony software products provide a directory with a list of online users and chat rooms as a mechanism to allow users to meet new friends around the globe. Still other programs either include a direct access capability or allow direct access after you access their central server. If you know the IP address of a user you want to call directly, you can simply enter his or her IP address and click on a call button on the program interface to initiate a call. If the called party is logged on to the Internet and running the same software program, the call will be received.

A relatively new addition to Internet telephony is the incorporation of telephone software into different Internet messenger products, such as Yahoo! Messenger. Yahoo! Messenger and similar products that added a "Call" capability were using the facilities of Net2Phone or another native communications carrier that operates a series of gateways in larger cities across the globe. In the case of Yahoo! Messenger, users of this service could make telephone calls to any telephone in the United States without charge. Calls to overseas locations could be made at a substantial discount to the cost associated with the use of one of the big three long-distance companies. Now that we have an appreciation for the basic components required to implement Internet telephony, let's look at some of the constraints and compatibility issues associated with this relatively new technology.

Constraints and Compatibility Issues

There are a number of constraints associated with the use of Internet telephony, and they are solved in a variety of ways by different vendors. Unfortunately, the lack of a uniform approach to solving these constraints results in a number of compatibility issues that make interoperability between different vendor products difficult, if not impossible. We will examine each of these groups in this section.

Bandwidth Conservation Method
The ability to reproduce a natural-sounding conversation requires a trade-off between the speech-coding scheme and the processing power of the PC. In general, highly compressing speech while still reproducing natural-sounding conversation requires more processing power than encoding schemes that either reproduce synthetic-sounding speech or produce digitized speech at a higher data rate.

There are three voice-coding techniques used in different popular Internet telephony applications: G.729, G.723.1, and G.728.

Until recently, the G.729 coding method was very popular. However, the standardized G.723.1 dual-speed coding method is gaining in popularity and is also recommended as the low-bit-rate speech coder for the ITU H.323 standard for video and voice communications over packet-based networks. At one time, it was expected that most Internet telephony products would adopt the G.723.1 Recommendation. However, from a real-world operational perspective, the delay or latency associated with the G.723.1 dual-speed coding method is approximately 60 ms, which is 12 times the delay associated with the use of the LD-CELP (G.728) coding method. In many situations, the additional delay is of sufficient duration to cause reconstructed voice to sound awkward. Thus, the trade-off between bandwidth and delay can represent a key item you may wish to consider when configuring an Internet telephony product that supports multiple voice-coding methods. Many times, a voice-coding method that results in awkward-sounding reconstructed voice at one operating rate may provide a higher quality of reconstructed voice when you select a higher operating rate.

Bandwidth conservation method: G.729; G.723.1; G.728
Packet delay and loss handling: Repair lost packets with silence; repair lost packets with synthetic speech
LAN connectivity operation: Requires modification to firewall and router access lists; products use different ports
Connection method: Directory or IP address-based; protocols used; gateway based
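
The bandwidth-versus-delay trade-off described above can be put into numbers with a short sketch. The bit rates are the nominal rates of the ITU-T Recommendations; the 5 ms LD-CELP delay is only inferred from the "12 times" statement in the text, no delay is given for G.729, and the names `CODECS`, `compression_ratio`, and `summarize` are illustrative assumptions.

```python
PCM_RATE_KBPS = 64.0  # uncompressed PCM rate mentioned in the Overview

# Nominal bit rates (kbit/s). Delay values are taken or inferred from the
# text above; None means the text gives no figure for that coder.
CODECS = {
    "G.729": {"rate_kbps": 8.0, "delay_ms": None},
    "G.723.1 (low rate)": {"rate_kbps": 5.3, "delay_ms": 60.0},
    "G.723.1 (high rate)": {"rate_kbps": 6.3, "delay_ms": 60.0},
    "G.728 (LD-CELP)": {"rate_kbps": 16.0, "delay_ms": 5.0},
}


def compression_ratio(rate_kbps):
    """Return how many times smaller the coded stream is than 64-Kbps PCM."""
    return PCM_RATE_KBPS / rate_kbps


def summarize(codecs=CODECS):
    """Return (name, compression ratio, delay in ms) for each coder."""
    return [
        (name, round(compression_ratio(c["rate_kbps"]), 1), c["delay_ms"])
        for name, c in codecs.items()
    ]


if __name__ == "__main__":
    for name, ratio, delay in summarize():
        print(f"{name}: {ratio}x compression, delay {delay} ms")
```

The coder that saves the most bandwidth in this table is also the one with the longest delay, which is the configuration trade-off the text describes.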

Packet Delay and Loss Handling
Two key considerations associated with the use of packet networks affect voice transmission: packet delay and packet loss. The congestion of routers and gateways, resulting from either processing packets or the inability to transfer packets onto communications facilities due to heavy line utilization, causes delay or loss of packets.

When data is being transferred, a slight delay in the arrival of one or more packets is usually not noticeable. At worst, it might result in a person waiting an extended period of time for a file transfer to complete, but the content of the transferred file would not be affected. Similarly, the loss of packets as they flow through an IP network, which results from congestion that causes routers, workstations, or gateways to discard packets, is compensated for by the retransmission of discarded packets. However, when packets transport digitized voice, normal data transmission methods cannot be used. This is because the loss or delay of packets results in the disruption of speech intelligibility.

There are two methods that can be used to compensate for the loss or delay of packets. Those methods involve repairing lost or delayed packets with periods of silence or with synthetic speech.

Silence Generation
Currently, most Internet telephony applications simply generate periods of silence to compensate for lost packets and reproduce delayed packets. This results in the clipping of speech and a loss of its intelligibility when packets are lost in the network, and a distortion of speech when delayed packets are used to reproduce speech. 

Voice Reconstruction
Voice reconstruction can occur by the receiver attempting to reconstruct the missing segments of speech from correctly received packets preceding the packet or packets that are lost or delayed. This can be accomplished by the repetition of a portion of the last correctly received speech waveform or via an interpolation process. When a combined transmitter and receiver method is used, extra information is included within each transmitted packet to facilitate the reconstruction and interpolation process at the receiver. Another combined transmitter and receiver technique involves the adjustment of packet sizes dynamically, based on packet delay and packet loss metrics. Making packets smaller enhances their ability to flow through a packet network, since many routers and gateways use queues that favor small-size packets. Currently, no standards exist concerning the handling of lost or delayed packets, and it may be several years until an approach is standardized. This means that a product that provides this capability should offer an option to turn off lost and delayed packet handling if it is to interoperate with other products that do not offer this feature or that implement it in a different manner.
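
The two receiver-side approaches, silence generation and repetition of the last correctly received waveform, can be compared in a minimal sketch. It assumes speech arrives as fixed-length frames of integer samples and that a lost frame is represented by `None`; the function names are hypothetical and no particular product works exactly this way.

```python
def conceal_with_silence(frames, frame_len):
    """Replace each lost frame (None) with a frame of zero-valued samples."""
    return [list(f) if f is not None else [0] * frame_len for f in frames]


def conceal_with_repetition(frames, frame_len):
    """Replace each lost frame with a copy of the last good frame.

    Falls back to silence if no frame has been received yet.
    """
    out = []
    last_good = [0] * frame_len
    for f in frames:
        if f is not None:
            last_good = list(f)
        out.append(list(last_good))
    return out


if __name__ == "__main__":
    received = [[1, 2], None, [5, 6]]  # the middle frame was lost
    print(conceal_with_silence(received, 2))
    print(conceal_with_repetition(received, 2))
```

Repetition avoids the abrupt gap that silence produces, at the cost of replaying stale audio; interpolation between neighboring frames would be a further refinement not shown here.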

LAN Connectivity Operation

Although most Internet telephony products were originally developed to use SLIP and PPP dial connections to ISPs, many products now support the use of LAN connections, enabling the product to recognize gateway and Domain Name Server (DNS) addresses configured with the TCP/IP protocol stack operating on a computer. Since Internet telephony products can use both TCP and UDP ports for establishing connections to a directory server or directly to a distant party, the use of those ports may cause conflicts with existing firewalls or router access lists designed to provide a level of security to organizational computational equipment located behind those devices.

Security Considerations
To illustrate the effect of security implemented in the form of access lists on voice traffic, consider the picture below, which shows the use of a firewall to protect a corporate LAN. DMZ is an acronym for demilitarized zone, and a DMZ LAN represents a local area network that has no workstations connected to the network. This means that inbound packets from the Internet must first flow onto the DMZ LAN, from which they are received on one port of the firewall for processing prior to being placed onto the corporate LAN. By using a DMZ LAN, the firewall is able to process every inbound packet before it can be received by another corporate network device.

[Figure: lan-connectivity.gif]

Both routers and firewalls process packets based on access lists as well as other metrics; however, the access list can be considered the initial qualifier, determining whether the packet reaches the next processing step or is discarded. Thus, it is extremely important for router and firewall access lists to be configured to enable the use of an Internet telephony product. For example, the VocalTec Internet Phone product uses two channels. It uses TCP port 6670 to connect to the vendor's Internet Phone Server's directory service, while audio is passed through UDP port 22555 on both local and remote computers. Internet Phone also uses TCP port 25793 to connect to the VocalTec addressing server and TCP port 1490 for whiteboard, chat, and file transfer when its conferencing option is used.
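
The role of the access list as an initial qualifier can be sketched as a simple permit table with an implicit deny. The port numbers are those quoted above for Internet Phone; the rule format and the `is_permitted` function are illustrative assumptions and do not correspond to any particular router or firewall syntax.

```python
# (protocol, port) pairs that the access list permits.
PERMIT_RULES = [
    ("tcp", 6670),   # directory service
    ("udp", 22555),  # audio
    ("tcp", 25793),  # addressing server
    ("tcp", 1490),   # whiteboard, chat, and file transfer
]


def is_permitted(protocol, port, rules=PERMIT_RULES):
    """Return True if the packet matches a permit rule; otherwise deny."""
    return (protocol.lower(), port) in rules


if __name__ == "__main__":
    print(is_permitted("udp", 22555))  # audio channel on the permitted port
    print(is_permitted("tcp", 22555))  # same port, wrong protocol
```

Note that the protocol matters as well as the port number: permitting TCP 22555 would not let the UDP audio stream through.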


Virtual Private Networks: A Technology Overview

What is a Virtual Private Network?

A Virtual Private Network (VPN) is a network that uses the Internet or other network service as its Wide Area Network (WAN) backbone. In a VPN, dial-up connections to remote users and leased line or Frame Relay connections to remote sites are replaced by local connections to an Internet service provider (ISP) or other service provider's point of presence (POP). A VPN allows a private intranet to be securely extended across the Internet or other network service, facilitating secure e-commerce and extranet connections with business partners, suppliers and customers. There are three main types of VPN:

These types of VPN are shown in the following diagram.

Figure 1

All of these VPNs aim to provide the reliability, performance, quality of service, and security of traditional WAN environments using lower cost and more flexible ISP or other service provider connections. VPN technology can also be used within an intranet to provide security or control access to sensitive information, systems or resources. For example, VPN technology may be used to limit access to financial systems to certain users, or to ensure sensitive or confidential information is sent in a secure way. There are many definitions of a VPN. Some of the more common definitions are as follows:


VPNs Based on IP Tunnels
VPNs based on IP tunnels encapsulate a data packet within a normal IP packet for forwarding over an IP-based network. The encapsulated packet does not need to be IP, and could in fact be any protocol such as IPX, AppleTalk, SNA or DECnet. The encapsulated packet does not need to be encrypted and authenticated; however, with most IP-based VPNs, especially those running over the public Internet, encryption is used to ensure privacy and authentication to ensure integrity of data. VPNs based on IP tunnels are mainly self-deployed; users buy connections from an ISP and install VPN equipment which they configure and manage themselves, relying on the ISP only for the physical connections. VPN services based on IP tunnels are also provided by ISPs, service providers and other carriers. These are usually fully managed services with options such as Service Level Agreements (SLAs) to ensure Quality of Service (QoS). A Ten Point Plan for Building a VPN shows some of the steps taken when deploying an Internet-based VPN.
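
The idea of carrying an arbitrary inner packet inside an outer packet can be illustrated with a toy encapsulation sketch. The 12-byte outer header below (source address, destination address, inner protocol tag, payload length) is a deliberate simplification invented for this example; it is not a real IP header or any actual tunneling protocol, and no encryption or authentication is applied.

```python
import ipaddress
import struct

# Simplified outer header: src (4 bytes), dst (4 bytes),
# inner protocol tag (2 bytes), payload length (2 bytes).
OUTER_HEADER = struct.Struct("!4s4sHH")


def encapsulate(inner_packet, src, dst, inner_proto):
    """Prefix the inner packet (any protocol) with the outer header."""
    header = OUTER_HEADER.pack(
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
        inner_proto,
        len(inner_packet),
    )
    return header + inner_packet


def decapsulate(outer_packet):
    """Strip the outer header and return (src, dst, inner_proto, payload)."""
    src, dst, inner_proto, length = OUTER_HEADER.unpack_from(outer_packet)
    start = OUTER_HEADER.size
    payload = outer_packet[start:start + length]
    return (
        str(ipaddress.IPv4Address(src)),
        str(ipaddress.IPv4Address(dst)),
        inner_proto,
        payload,
    )


if __name__ == "__main__":
    tunneled = encapsulate(b"non-IP frame", "192.0.2.1", "198.51.100.7", 1)
    print(decapsulate(tunneled))
```

Routers between the two tunnel endpoints only examine the outer addresses, which is why the inner packet can belong to any protocol.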

The following diagram shows an Internet-based VPN that uses secure IP tunnels to connect remote clients and devices.

Figure 2

VPNs based on IP tunnels provide the following benefits:

The main disadvantage of VPNs based on IP tunnels is that QoS levels may be erratic and are not yet as high as those of alternative solutions. Also, for VPNs based on the public Internet, higher levels of security such as authentication and data encryption are essential to ensure integrity and security of data. Note that ISP connections used for VPNs do not necessarily need to be protected by a firewall as data is protected through tunneling, encryption, etc. Also, you can use separate ISP connections for general Internet access and VPN access, or you can use a single connection with a common router with a VPN device and firewall in parallel behind it. In some cases, you can use devices that integrate one or more of these functions.



VPNs Based on ISDN, Frame Relay or ATM
VPNs based on ISDN, Frame Relay or ATM connections are very different from VPNs based on IP tunnels. This type of VPN uses public switched data network services and uses ISDN B channels, PVCs, or SVCs to separate traffic from other users. Single or multiple B channels, PVCs, or SVCs may be used between sites with additional features such as backup and bandwidth on demand. Data packets do not need to be IP, nor do they need to be encrypted. Due to more widespread awareness about security issues, however, many users now choose to encrypt their data. The following diagram shows a carrier-based VPN that uses ISDN B channels and Frame Relay PVCs to connect remote clients and devices.

Figure 3

VPNs based on public switched data networks are usually provided by service providers and other carriers, and may or may not provide fully managed services. In most cases, additional services such as QoS options are available. This type of VPN is likely to become particularly popular in Europe, where public switched data networks are widely available and business use of the Internet is less developed. The main benefits of VPNs based on ISDN, Frame Relay or ATM connections include the following:

The main disadvantages of this type of VPN are that ISDN, Frame Relay and ATM services may be expensive and are not as widely available as ISP services. In addition, it is often harder to provide extranet and e-commerce connections to business partners, suppliers and customers.

A Note About the Term "VPN"
The term VPN is used for many different services, including remote access, data, fax, and voice over IP (VoIP). The other sections in this discussion are concerned with just two types of VPN service: remote access and intranet. However, much of the discussion on intranet QoS requirements is relevant to multimedia, including VoIP.

VPN Benefits                                               

VPNs offer considerable cost savings over traditional solutions. VPNs cost considerably less than traditional leased line, Frame Relay or other services, because long-distance connections are replaced with local connections to an ISP's point of presence (POP), or local connections to a service provider or carrier network.

Figure 4

Reduced Costs
VPNs offer the network manager a way to reduce the overall operational cost of wide area networking through reduced telecom costs. In the case of a managed VPN service, the savings can be greater as the ISP or service provider manages the WAN equipment, allowing fewer networking staff to manage the security aspects of the VPN. In many cases, implementing a VPN also means that more use is made of an existing dedicated Internet connection.

Flexibility
VPNs based on IP tunnels, particularly Internet-based VPNs, also allow greater flexibility when deploying mobile computing, telecommuting and branch office networking. Many corporations are continuing to experience explosive growth in the demand for these services. VPNs provide a low-cost and secure method of linking these sites into the enterprise network. Due to the ubiquitous nature of ISP services, it is possible to link even the most remote users or branch offices into the network.


Examples
The following examples, based on real-life costs, show how you can make significant savings by implementing VPN-based solutions. The first example shows the cost of a dial up VPN service compared to a traditional remote access solution, while the second example shows the cost of an intranet VPN solution compared to a traditional WAN solution. The final example shows the costs of an international VPN service based on an encrypted 128 Kbps Frame Relay connection compared to a 64 Kbps dedicated leased line.



Figure 5

According to Forrester's research, the cost savings of an Internet-based dial VPN solution compared to a traditional RAS approach are staggering, as shown in the following table. However, to assess the cost justification completely, we must also consider the potential costs of making the switch to a VPN. A VPN may not make sense if, for example, nearly all of a company's remote users need only make a local call to access the network. This is especially true in the US, where local calls are free as there are no monthly usage charges.

In most European countries, however, this is not the case and a remote access solution based on ISDN may actually be cheaper than a dial VPN solution. In many European countries, ISDN tariffs are low, and extensive use of time cutting, protocol spoofing and filtering can dramatically reduce ISDN costs. See Cabletron's ISDN and Telesaving white paper for more details.

Moving to a dial VPN solution means that each remote user requires an ISP account, and the POPs must be local to the majority of the users. The cost benefits might not be as compelling if users are switched to an ISP account with a flat monthly rate but then must incur long distance call charges to connect to the ISP's nearest POP.

Example 2: Intranet VPN Versus Leased Line and Frame Relay
There are two areas where savings can be made with an intranet VPN solution compared to a traditional WAN solution:

Based on a study by Cabletron, the following table shows the average annual savings per site on the cost of intranet VPN access compared to the cost of traditional leased line access for different types of site. Note that the costs shown in the table are for bandwidth only.

Figure 6

Based on a cost comparison alone, the reasons for moving to an intranet VPN are compelling. However, a traditional WAN based on leased lines or Frame Relay provides guaranteed levels of Quality of Service (QoS). Replacing a traditional WAN between branch offices and central sites with an intranet VPN is unlikely to give the same levels of performance and QoS to users unless the service provider is able to give throughput and latency guarantees as part of a Service Level Agreement (SLA). See Quality of Service for more information about QoS and SLAs.

Example 3: International VPN Versus International Connections
The savings are particularly evident in the cost of international connections. A 128 Kbps VPN link between London and Tokyo provided by an international ISP costs around $20,000 per year, while a 64 Kbps leased line provided by a traditional carrier can easily cost around $160,000 per year. Even an international VPN service based on Frame Relay provided by a traditional carrier costs around a third of the cost of the 64 Kbps dedicated leased line.
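
Because the two links run at different speeds, the comparison is clearer when expressed as annual cost per Kbps. The short calculation below uses only the approximate figures quoted above; the function name is an illustrative assumption.

```python
def annual_cost_per_kbps(annual_cost_usd, rate_kbps):
    """Return the yearly cost in US dollars for each Kbps of capacity."""
    return annual_cost_usd / rate_kbps


vpn = annual_cost_per_kbps(20_000, 128)     # Internet VPN link
leased = annual_cost_per_kbps(160_000, 64)  # dedicated leased line

if __name__ == "__main__":
    print(f"VPN link:    ${vpn:.2f} per Kbps per year")
    print(f"Leased line: ${leased:.2f} per Kbps per year")
    print(f"Leased line costs {leased / vpn:.0f} times more per Kbps")
```

On these figures the leased line costs 8 times as much in absolute terms and 16 times as much per unit of bandwidth, since it delivers half the capacity.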



Internet VPNs

Figure 7

VPNs based on the Internet are becoming widely available, especially as an alternative for dial-up remote access. Generally when people talk about VPNs, they implicitly mean an Internet-based network as an alternative to a private network based on public network services such as T1 leased lines or Frame Relay. The Internet has become so ubiquitous and Internet service providers (ISPs) so numerous that it is now possible to obtain connections in all but the most remote locations. Most countries worldwide now have ISPs offering connections to the Internet, although some countries still restrict access. So it is possible for many organizations, both large and small, to consider the Internet not just for external communication with customers, business partners and suppliers, but for internal communications as well using a VPN.

Internet-based VPNs can be used to outsource remote access with significant cost savings and greater flexibility. Modem racks, remote access servers and the other equipment necessary to service the needs of remote and mobile users can be replaced with a managed service provided by an ISP (see Remote Access VPNs).

While Internet VPNs are suitable for remote access needs, there are still problems to overcome before moving to a full intranet VPN solution. Although most VPN products now offer adequate levels of security, the issue of Quality of Service (QoS) and Service Level Agreements (SLAs) remains. While most VPN service providers can offer guarantees for connectivity and uptime, few can offer adequate throughput and latency guarantees. In addition, there are few agreements between ISPs, so unless you can use a single ISP's IP backbone for all your connections, you are likely to suffer service degradation where connections cross boundaries between ISPs. Most users will not want to give up the levels of service currently offered by leased lines, Frame Relay or ATM networks for something inferior. However, in the long term these problems will be overcome, and Internet-based VPNs will become much more widespread for intranet as well as remote access. In a few years, global VPN services based on the Internet will become as cost-effective and as highly available as global Frame Relay and other public network services.

Public Network VPNs

Public networks such as ISDN, Frame Relay and ATM can carry mixed data types including voice, video and data. They can also be used to provide VPN services by using B channels, Permanent Virtual Circuits (PVCs) or Switched Virtual Circuits (SVCs) to separate traffic from other users. Optionally, authentication and encryption can be used where the identity of users and the integrity of data need to be guaranteed. Using PVCs, SVCs or B channels makes it easier to provide additional bandwidth or backup when needed. The traffic shaping capabilities of Frame Relay and ATM can be used to provide different levels of QoS, and because these services are based on usage, there is significant opportunity to reduce telecom costs even further by using bandwidth optimization features.

Figure 8

Frame Relay in particular has become a popular, widespread and relatively low-cost networking technology that is also suitable for VPNs. Running VPNs over a Frame Relay network allows expensive dedicated leased lines to be replaced and makes use of Frame Relay's acknowledged strengths, including bandwidth on demand, support for variable data rates for bursty traffic, and switched as well as permanent virtual circuits for any-to-any connectivity on a per-call basis. Frame Relay's ability to handle bursty traffic and built-in buffering means that it makes optimum use of available bandwidth, something that is important in a VPN environment where latency and performance are concerns. Frame Relay can be used to create a VPN in two ways:

Frame Relay is an end-to-end protocol that can be run over a variety of access technologies, such as ISDN, DSL (Digital Subscriber Line), and even POTS dial-up lines. New access methods such as switched virtual circuits (SVCs), ISDN access and backup mean that Frame Relay is now a much more reliable and cost-effective solution. Frame Relay can also run over, and interoperate with, ATM backbones, making it one of the most widely available public data networking services worldwide. As a result, major service providers and carriers have created global Frame Relay networks which are cost-effective and offer high availability. When coupled with tunneling, encryption and authentication, these attributes make Frame Relay an ideal candidate for global VPN services.

Remote Access VPNs

Figure 9

Remote access VPNs are rapidly replacing traditional remote access solutions as they are more flexible and cost less.

Remote access refers to the ability to connect to a network from a distant location. A remote access client system connects to a network access device, such as a network server or access concentrator. When logged in, the client system becomes a host on the network. Typical remote access clients might be:


We can divide remote access connections into two groups: local dial and long-distance dial. For traditional, private, remote access networks, local-area users connect using a variety of telecommunication data services. Remote access long-distance users rarely have a choice other than modem access over telephone networks. The aggregation devices that the clients connect to typically use channelized leased line and primary-rate ISDN, offering dedicated, circuit switched access.

With VPNs, local area users typically have a wider range of data services to choose from, regardless of the support at the enterprise or central site VPN equipment. However, long-distance connections are currently via modem access. What VPN carriers currently offer corporations are "Work Globally, Dial Locally" services. The VPN equipment will use high-speed leased lines to the nearest POP of the chosen VPN carrier and all remote access traffic can be aggregated or routed as IP datagrams over this single link.

Advantages of Remote Access VPNs over Traditional Direct-Dial Remote Access

Intranet VPNs

Intranet VPNs can be used to provide cost-effective branch office networking and offer significant cost savings over traditional leased-line solutions. Intranet, or site-to-site, VPNs apply to several categories of sites, from small office/home office (SOHO) sites to branch sites to central and enterprise sites. SOHO sites could be considered as remote access users where dial services are used, but as SOHO sites often have more than one PC, they are really small LAN sites. In an intranet VPN, expensive long distance leased lines are replaced with local ISP connections to the Internet, or secure Frame Relay or ATM connections as shown in the following diagram.

Local ISP connections can be provisioned using many technologies, from dial-up POTS and ISDN for small sites, to leased lines or Frame Relay for larger sites. New emerging "last mile" technologies such as DSL, cable and wireless provide both low-cost and high-speed access. Many ISPs and service providers are now starting to support these emerging technologies for Internet access, particularly for home users and SOHO sites. The intranet market is one where traditional WAN carriers are likely to compete heavily with ISPs. Traditional WAN carriers can offer a VPN service similar to a Frame Relay service with Quality of Service (QoS) based on Committed Information Rate. Traditional WAN carriers are well placed to push their advantage in providing secure, reliable, low-latency, intranet links by adapting their current services to support routed VPN links.


VPN Issues

There are a number of issues, both technological and practical, that need to be overcome before you can implement a VPN. Here are some of these issues.
For a VPN to function successfully, it must provide a number of essential features, in particular features that solve the problems that stem from routing private data across a shared public network. The main features are discussed here.

Security
Since a VPN is a shared-access, routed network, security is the main area of concern. It will require the use of encryption, secure key exchange/re-keying, session and per-packet authentication, security negotiation, private address space confidentiality, complex filtering, and a host of other precautions.

Performance and Quality of Service (QoS)
IP datagrams sent across the VPN carrier service may experience packet loss (silent discards) and packet reordering.
Packet loss tends to be greatly increased by stateful algorithms designed for point-to-point reliable links, for example, PPP compression and encryption algorithms. Throughput may also vary from POP to POP, country to country, and even hour to hour.
Reordering will cause problems for some LAN protocols, for example, when running bridging over a VPN.

Monitoring Actual Throughput
In the absence of Quality of Service guarantees from the VPN carriers, mechanisms are required to allow performance monitoring of tunnels.
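
One simple monitoring mechanism is to sample a tunnel's cumulative byte counter at intervals and convert the differences into throughput. The sketch below assumes such a counter is available from the tunnel endpoint; the sample format and the function name `throughput_kbps` are hypothetical.

```python
def throughput_kbps(samples):
    """Return per-interval throughput in Kbps (1 Kbps = 1000 bit/s).

    samples: list of (time_in_seconds, cumulative_bytes) readings,
    ordered by time.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        bits = (b1 - b0) * 8
        rates.append(bits / 1000 / (t1 - t0))
    return rates


if __name__ == "__main__":
    readings = [(0, 0), (10, 160_000), (20, 240_000)]
    print(throughput_kbps(readings))
```

Comparing these measured rates against the nominal link speed over time shows the POP-to-POP and hour-to-hour variation mentioned above.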

Preventing Denial of Service Attacks
Being connected to a public network, the VPN receive-data path can be clogged by unsolicited data to such an extent that no useful business can be achieved. Unlike a private leased line, traffic that is not from the peer remote site (tunnel end-point) can flood down the receive path of a VPN tunnel from anywhere on the public network. For client-based tunnels, there are currently no services to prevent this.
In the case where the VPN carrier is providing the tunnel, the VPN carrier could offer to filter non-VPN traffic, or perhaps provide a bandwidth reservation service. For the L2TP VPN carrier-based approach, the client is protected by the fact that it is not reachable via the public network, as no global address is assigned.

Scalability
The term scalability refers to how well a system can adapt to increased demands. A scalable network system is one that can start with just a few nodes but can easily expand to thousands of nodes. Scalability can be a very important feature because it means that you can invest in a system with confidence that you won't outgrow it. If VPN carriers are to succeed in VPN deployment, the technologies they use need to scale easily. The VPN customer will also require this at larger Security Gateway sites. Enterprises will need to consider:

  • The overhead associated with security mechanisms.

  • The overhead associated with encryption and compression,
    which both require a lot of processing power. Hardware
    compression and encryption may be needed to cope with this
    load.

  • Key management, including methods of key generation, 
    distribution and exchange.

Management
Client-based software should be as transparent as possible. VPN carriers will require new management tools in order to simplify the configuration and monitoring of a corporate customer's VPN. Also, VPN customers may well want a privileged management window into their VPN carrier-held database to make changes for themselves!

Flexibility
To offer a "go anywhere" VPN service, VPN carriers are keen to provide a service that can support all protocols and all data links (e.g. PPP over anything).

Telesaving
Telesaving means making cost-effective use of WAN data services. Telesaving is appropriate to all WAN links, but is particularly useful for "pay-as-you-use" data services, for example, ISDN. For clients using this type of service to access the VPN carrier network (and from there, a tunnel server), telesaving needs to be performed from a central site (an Enterprise Security Gateway) for data links that are connected indirectly via the VPN carrier network.
New, VPN-specific, telesaving features will be needed to take advantage of the possibility of cheap bandwidth via a VPN link, while maintaining some layer of service using more expensive, private data links when needed.

Bandwidth Reservation and Quality of Service (QoS)
Bandwidth reservation and Quality of Service (QoS) refer to the ability to "reserve" transmission bandwidth on a network connection for particular classes of traffic or particular users. It allocates percentages of total connection bandwidth for specified traffic classes or users, which have given priority levels assigned to them. A bandwidth reservation algorithm is used to decide which packets to drop when there is too much network traffic for the available bandwidth.
Given a fixed capacity VPN WAN link (say a T1), it is desirable to reserve bandwidth outbound (and inbound if possible) on a per user (remote access) or per remote LAN basis. There are, however, some questions about how bandwidth reservation can be accomplished over tunnels. For outbound reservation, the Security Gateway could implement transmit priority queues, but inbound reservation requires the assistance of the VPN carrier.
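
The two ideas above, percentage-based reservation and a priority-based drop decision, can be sketched as follows. This is a minimal illustration under stated assumptions: the T1 rate of 1544 Kbps, the traffic class names, and the greedy transmit policy are chosen for the example and do not represent any specific gateway's algorithm.

```python
T1_KBPS = 1544  # nominal T1 line rate


def allocate(link_kbps, shares):
    """Convert {traffic class: percent} into {traffic class: reserved Kbps}."""
    if sum(shares.values()) > 100:
        raise ValueError("reservations exceed 100% of the link")
    return {name: link_kbps * pct / 100 for name, pct in shares.items()}


def select_for_transmit(packets, capacity_bytes):
    """Decide which queued packets to send in one interval.

    packets: list of (priority, size_bytes); a lower number means a
    higher priority. Packets are considered in priority order and any
    packet that no longer fits in the remaining capacity is dropped.
    Returns (sent, dropped).
    """
    sent, dropped = [], []
    remaining = capacity_bytes
    for pkt in sorted(packets, key=lambda p: p[0]):
        if pkt[1] <= remaining:
            sent.append(pkt)
            remaining -= pkt[1]
        else:
            dropped.append(pkt)
    return sent, dropped


if __name__ == "__main__":
    print(allocate(T1_KBPS, {"remote access": 25, "branch LAN": 50}))
    print(select_for_transmit([(2, 600), (1, 500), (3, 600)], 1200))
```

This only models the outbound direction that a Security Gateway controls; inbound reservation, as noted above, depends on the VPN carrier.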

    Some possibilities for inbound reservation are:

  • The ISP POP access device could apply tunnel/non-tunnel 
    bandwidth reservation and filtering techniques to the 
    client's requirements.

  • The VPN carrier could offer an SVC-style service where 
    each VPN link has some predetermined capacity.

  • L2TP network servers or access concentrators have the 
    option of inbound, dynamic, flow control to help inbound 
    bandwidth reservation.

  • Remote VPN clients can be flow-controlled using L2TP 
    sequence numbers/window size in order to reserve 
    appropriate bandwidth for individual VPN clients and 
    non-VPN traffic. To be effective, the VPN carrier POP 
    would need to support at least a broad VPN/non-VPN queuing 
    priority inbound to the L2TP network server.

    It would be useful if bandwidth reservation could be 
    managed dynamically.

High-Performance Routing Issues
With encryption being used from intranet or host-to-host, the nature of IP-switching filters changes. For IP-switching (L3 switching) to function on encrypted data flows, it may need to understand the IPSec and L2TP standards. For example, the definition of a flow may need to make use of the IPSec protocol headers to identify a communication stream. As an example, it may be possible to trigger on the SPI field of the ESP header used in IPSec as a means of identifying a stream. For L3 switches that terminate secure tunnels, no fast forwarding is possible since the encrypted IP packet needs to be reconstituted before being forwarded. There is also the extra load of decrypting/encrypting for these secure tunnels. In time, encryption (and compression) will be present in all hosts and there will be less need for routers to terminate secure tunnels, allowing switching based on tunnel header information and requiring no encryption/decryption horsepower. Work to redefine the TOS field of IP packets as part of DiffServ may deliver the means to reinstate traffic prioritization in L3 switches for secure data flows.
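
Identifying a flow from the SPI can be sketched as follows. The ESP header begins with a 32-bit Security Parameters Index followed by a 32-bit sequence number, both in network byte order, and neither field is encrypted. The function name and the choice of (source, destination, SPI) as the flow key are assumptions made for illustration.

```python
import struct


def esp_flow_key(src_ip, dst_ip, esp_packet):
    """Return a flow key built from the addresses and the ESP SPI.

    esp_packet: the bytes that follow the IP header, starting with
    the ESP header. Only the first 8 bytes are examined; the
    encrypted payload is never touched.
    """
    if len(esp_packet) < 8:
        raise ValueError("too short to contain an ESP header")
    spi, _sequence = struct.unpack_from("!II", esp_packet)
    return (src_ip, dst_ip, spi)


if __name__ == "__main__":
    packet = struct.pack("!II", 0x1234, 7) + b"encrypted payload"
    print(esp_flow_key("192.0.2.1", "198.51.100.7", packet))
```

Because the key is derived without decryption, a switch could in principle classify the stream while leaving the tunnel intact.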




Quality of Service

What Quality of Service can you expect from your VPN service provider and how can you measure what you are getting? Most data services, such as Frame Relay, provide guarantees for uptime and availability, as well as throughput and response time. These guarantees, or Quality of Service (QoS) metrics, are defined in the Service Level Agreement (SLA) with your service provider.
While most managed VPN services provide a certain level of guaranteed uptime and availability, many do not provide comparable performance and latency guarantees, nor do they offer throughput guarantees. There are several different schemes used to provide Quality of Service, some of which have been developed specifically with a particular technology or protocol in mind, such as Ethernet or ATM. Other schemes are specific to the IP protocol and are being developed by the IETF. Examples of different QoS schemes are:

If you are considering a managed VPN service, you need to pay particular attention to the QoS metrics specified in the SLA from your service provider. If the service provider is unable to provide adequate SLA guarantees, you may need to reconsider how you deploy VPNs in your environment. Some applications, such as dial-up remote access, are very suited to the VPN approach as users are unaccustomed to guaranteed uptime and availability and are less demanding of the service. However, replacing dedicated leased line or Frame Relay connections between branch offices and central sites with an intranet VPN is unlikely to give the same levels of performance and QoS to users unless the service provider is able to give throughput and latency guarantees.


SLA Checklist
Here are some things to ask your service provider about SLAs:

SLAs In the Future
Over the long term, SLAs for VPN services are likely to improve as the various QoS schemes are deployed more widely. Until then, however, SLAs may be limited to connections over a single service provider's network. To ensure end-to-end SLAs in the interim, traffic should stay on the same network. If the connection crosses networks, a service provider has little control over the quality of the other provider's network. This situation is likely to remain until service providers reach agreement on SLA interworking.



VPN Futures

VPNs are only just starting to be deployed. Once VPNs are in wide use, they will provide the opportunity to integrate other types of communication such as multimedia and Voice over IP (VoIP).

The primary concern for VPNs will always be security. However, once VPN products are widely available, the focus will fall more and more on delivering quality of service (QoS) and class of service (CoS) over IP networks as part of a VPN. As voice and data services merge into one (voice over IP, IP fax), new network services are being developed to offer the QoS/CoS required for data, telephony and fax. (For more information about QoS, see Quality of Service and SLAs.) As products develop to take advantage of this opportunity, all communication devices will become IP addressable, providing voice, fax, video and data to the desktop. All of these services can make use of VPN security protocols.

Name servers could become very useful for configuring and reconfiguring VPNs. If the routers in a complex intranet VPN network were to make use of name servers to locate peer routers, then these networks could be reconfigured simply by changing the name-to-address mapping. Work is in progress to extend the use of DNS servers to provide a secure (IP Security-based) mechanism for routers to find peer routers and clients to find servers.
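The name-based approach can be pictured with a short Python sketch: a router keeps only the names of its peers and looks up their addresses when it builds its tunnel table, so changing the name-to-address mapping is enough to repoint the tunnels. The host names below are hypothetical, and the resolver is passed in as a parameter so the idea can be shown without a live DNS server. The secure, IP Security-based lookup mentioned above is not modelled here.

```python
import socket


def resolve_peers(peer_names, resolver=socket.gethostbyname):
    """Map each peer router name to the address the resolver currently returns."""
    return {name: resolver(name) for name in peer_names}


def changed_peers(old_table, new_table):
    """Names whose resolved address differs between two lookups."""
    return sorted(
        name for name in new_table if old_table.get(name) != new_table[name]
    )


if __name__ == "__main__":
    # Stand-in for a name server: a dictionary of hypothetical records.
    records = {
        "hq.vpn.example.com": "192.0.2.1",
        "branch1.vpn.example.com": "192.0.2.10",
    }
    before = resolve_peers(records.keys(), resolver=records.__getitem__)

    # Reconfigure the VPN simply by changing one name-to-address mapping.
    records["branch1.vpn.example.com"] = "198.51.100.10"
    after = resolve_peers(records.keys(), resolver=records.__getitem__)

    print(changed_peers(before, after))
```

With the default resolver the same function would query the system's configured name servers instead of the dictionary.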

Next Generation VPN Carriers
New VPN carriers are emerging to take advantage of the new markets, and traditional telecommunications providers see that the aggregation possible with routed networks makes good sense for remote access data, as it reduces the strain on long-haul dial services as well.

New 'last-mile' technologies like Digital Subscriber Line (DSL) give the phone companies a means to provide high-bandwidth IP access over existing cabling (twisted-pair copper). Cable companies also have the potential to deliver high-bandwidth IP access over existing and new cable infrastructure. As the phone and cable companies become familiar with delivering IP services, these new last-mile technologies put them in a good position to acquire a significant share of the Internet access and VPN markets.

New providers are focusing on providing VPN services. A popular technique is to build an ATM or Frame Relay backbone and then offer VPN links with guarantees on throughput and latency, enabling customers to outsource remote access, site-to-site and even interoffice fax and voice. These networks are well placed to offer everything from voice to site-to-site by making use of the quality of service options inherent in ATM and Frame Relay networks.

To offer global services to a VPN customer with global data needs, consortiums of VPN carriers are forming to offer a uniform service internationally. Many of these services are based on ATM and Frame Relay, although new IP-based services are becoming available.



VPNs and Voice/Data Convergence
Figure 11

Companies today use different communications infrastructure to provide their voice, data and Internet connectivity needs. On the voice side, components include a PABX, key system or Centrex service with features such as voice mail and automated attendant. Computer Telephony Integration (CTI) applications may also be used to link voice capabilities with data applications. On the data side, LAN infrastructure is typically provided by a stackable or chassis-based hub with multiple 10/100 Ethernet segments. WAN connectivity is typically provided by a router using leased lines or Frame Relay, with Internet connections for e-mail and web browsing provided via a separate firewall connection.

Companies that use a variety of data and voice services to meet their communication needs will find new alternatives becoming available that offer direct and indirect cost savings. New customer-premises routers are now appearing that act as both Security Gateways and Multimedia Gateways. These Multiservice Routers integrate a number of LAN and WAN capabilities, such as hub and routing functions, and also support new applications such as Voice over IP (VoIP), IP fax and Internet access (browsing, publishing, e-mail, e-commerce), as well as VPN traffic, over a single local-loop link to a service provider POP.

An initial investment in web access and web publishing may well be the starting point for a company that wishes to take advantage of VPN services. For the move from web publishing and e-mail to full e-commerce, companies may follow these steps:

 

 
