One of the main issues with VoIP over 3G networks is that the number of possible simultaneous calls per cell is much lower today than the number of calls that can be transported over 3G networks in circuit switched mode. This is because the radio interface has been optimized on every layer to squeeze through as many circuit switched voice calls as possible. VoIP calls, on the other hand, are transported over IP, which makes it impossible to adapt each layer of the air interface specifically to the application, as each protocol layer is independent of the ones above and below.
Another disadvantage of transporting voice over IP is its requirement for real time data transmission. As voice data can be compressed quite well, the required bandwidth is quite small. In order to keep the delay acceptable, a single IP packet only carries around 20 milliseconds of speech data. At this rate, the additional information generated by the air interface, IP, UDP and RTP headers is almost as large as the actual voice data. This doubles the bandwidth required to transport a voice call over IP compared to transporting it over optimized circuit switched channels on the air interface.
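To put rough numbers on this, here is a small back-of-the-envelope sketch. It assumes a codec rate of about 12 kbit/s (my assumption at this point, in line with the AMR figure further down) and the minimum header sizes of IPv4, UDP and RTP without options. Air interface headers are left out, and the function name is just for illustration.

```python
# Minimum header sizes in bytes (no IP options, no RTP extensions).
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12

def voip_packet_overhead(codec_kbps, frame_ms=20.0):
    """Return (payload_bytes, header_bytes, total_kbps) for one voice stream.

    codec_kbps is an assumed codec bit rate; frame_ms is the amount of
    speech carried per IP packet (around 20 ms according to the text).
    """
    payload_bytes = codec_kbps * frame_ms / 8  # kbit/s * ms = bits, / 8 = bytes
    header_bytes = IPV4_HEADER + UDP_HEADER + RTP_HEADER
    total_kbps = (payload_bytes + header_bytes) * 8 / frame_ms
    return payload_bytes, header_bytes, total_kbps

payload, headers, total = voip_packet_overhead(codec_kbps=12.0)
print(f"payload {payload:.0f} B, headers {headers} B, total {total:.1f} kbit/s")
# prints: payload 30 B, headers 40 B, total 28.0 kbit/s
```

With these assumptions the uncompressed headers are even slightly bigger than the speech payload, so the stream needs a bit more than twice the codec rate.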
As if this was not enough, there is yet another problem that plagues VoIP over wireless: While most other IP applications benefit from retransmission of lost or damaged air interface frames, this is most unwelcome for VoIP, as it's better to lose a couple of frames than to wait for the retransmission. As the lower layers are not application aware, however, it's not possible to carry voice and the data of other applications over the same connection and treat them differently on the air interface.
HSDPA And Intelligent Scheduling Come To The Rescue
While I have known all this for some time and was thus a bit pessimistic about the mid-term success of VoIP over 3G and WiMAX networks, Harri Holma and Antti Toskala describe in their book about HSDPA, or 3.5G as it is sometimes called in the press, that VoIP capacity is not necessarily lower than 3G circuit switched capacity per cell. Compared to an average of around 64 simultaneous circuit switched calls per cell as referenced in their book, they present a study which results in equal or even higher VoIP capacity in an HSDPA enabled cell. So how is this possible with all the difficulties mentioned before? Here are the main principles they used for their calculations:
Due to the use of higher order modulation for mobile stations with good reception conditions, better error coding and fast retransmission, the total capacity of an HSDPA cell is twice as high as that of a 3G UMTS only cell.
Use of AMR
Many VoIP implementations today use the G.711 codec for digital voice transmission, which requires a bandwidth of 64 kbit/s. For the HSDPA cell capacity calculation, the authors used the AMR codec instead, which is also used for circuit switched wireless calls today. It only requires around 12 kbit/s to achieve the same voice quality.
Compressing IP headers of VoIP frames is absolutely essential for capacity. Thus the authors have assumed the use of Robust Header Compression (ROHC) for their simulation. This is quite realistic for the future as ROHC between the mobile station and the RNC is already in the 3GPP standards.
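To get a feel for why header compression matters so much, here is a rough sketch that compares the per-call bit rate with and without it. The 30 byte payload corresponds to 20 milliseconds of speech at around 12 kbit/s, and 40 bytes is the minimum size of the IPv4, UDP and RTP headers combined. The 3 byte compressed header is purely my assumption of a typical steady-state ROHC header; it is not a figure from the book, and real values vary with the compression state.

```python
def per_call_kbps(payload_bytes, header_bytes, frame_ms=20.0):
    """Bit rate of one voice stream sending one packet every frame_ms."""
    return (payload_bytes + header_bytes) * 8 / frame_ms

AMR_PAYLOAD_BYTES = 30    # 20 ms of speech at ~12 kbit/s
UNCOMPRESSED_HEADER = 40  # IPv4 (20) + UDP (8) + RTP (12)
ROHC_HEADER = 3           # ASSUMPTION: typical steady-state size, not from the book

plain = per_call_kbps(AMR_PAYLOAD_BYTES, UNCOMPRESSED_HEADER)
compressed = per_call_kbps(AMR_PAYLOAD_BYTES, ROHC_HEADER)
print(f"without ROHC: {plain:.1f} kbit/s, with ROHC: {compressed:.1f} kbit/s")
# prints: without ROHC: 28.0 kbit/s, with ROHC: 13.2 kbit/s
```

Under these assumptions, header compression brings the per-call rate back close to the plain codec rate.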
HSDPA packets have a transmission duration of 2 milliseconds. A 2 ms packet, however, can hold several VoIP packets. To achieve the highest cell capacity, the traffic scheduler has to hold back enough packets destined for a user to fill up a full air interface frame before they are sent. While this increases the total VoIP capacity of the cell, it also has the disadvantage of introducing unwanted speech delay. For their simulation, the authors did not queue more than three VoIP packets for a single user. At 20 milliseconds of speech per packet, this introduces a maximum additional delay of 60 milliseconds.
Fast HSDPA retransmission
HSDPA reduces the retransmission problem for VoIP described above with its fast retransmission scheme. A faulty packet can be retransmitted within 10 milliseconds. If the air interface parameters are set to ensure that at most two retransmissions are required before the packet can be decoded correctly on the other end, the maximum additional delay is 20 milliseconds.
Based on the assumption that an additional latency of 80 milliseconds is acceptable to the user, the authors show that an HSDPA network can have the same or even better voice capacity than 3G networks have today for circuit switched calls. There is still some way to go until we are at this point, as enhancements have to be made in all parts of the network. But this study impressively shows the way forward!
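The 80 milliseconds are simply the sum of the two delay sources described above. The little sketch below makes the arithmetic explicit, using the simple upper bound from the text (each queued packet counts with its full 20 milliseconds of speech). The function and constant names are my own.

```python
SPEECH_PER_PACKET_MS = 20  # speech carried by one VoIP packet
RETRANSMISSION_MS = 10     # time for one fast HSDPA retransmission

def added_latency_ms(max_queued_packets, max_retransmissions):
    """Return (queuing, retransmission, total) worst-case added delay in ms."""
    queuing = max_queued_packets * SPEECH_PER_PACKET_MS
    retransmission = max_retransmissions * RETRANSMISSION_MS
    return queuing, retransmission, queuing + retransmission

# Values used in the study: up to 3 queued packets, up to 2 retransmissions.
print(added_latency_ms(3, 2))
# prints: (60, 20, 80)
```

Playing with the two parameters shows the trade-off: queuing fewer packets per user lowers the delay but leaves air interface frames less well filled.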