Industry Video Teleconferencing Profile - VTC001

Table of Contents

5.1. Video, communications and control

5.2. Control and Indication signals

5.3. Audio

5.4. Confidentiality and secure operation

5.5. Multipoint Control Unit (MCU)
5.5.1. General
5.5.2. Video, Communications and Control
5.5.2.1. General
5.5.2.2. Video Switching (Selective Presence)
5.5.2.2.1. Voice Activated Switching
5.5.2.2.2. User Broadcast Control
5.5.2.2.3. User Select Control
5.5.2.2.4. Chair Control
5.5.2.2.5. FEC Framing on Switching
5.5.2.2.6. Terminal Identifiers
5.5.2.3. Video Mixing (Continuous Presence)
5.5.2.4. Selection of SCM
5.5.2.4.1. Minimum SCM
5.5.2.4.2. Secondary VTUs
5.5.3. Audio
5.5.3.1. General
5.5.3.2. Audio Mixing
5.5.3.3. Voice Activated Switching
5.5.4. Data Communications
5.5.5. Confidentiality and Security
5.5.6. Cascading
5.5.7. Simultaneous Conference Operation
5.5.8. Value Added Services

5.6. VTU Control of Multipoint Conference

5.5. Multipoint Control Unit (MCU)

5.5.1. General.

The MCU shall enable three or more VTU systems to participate in an audiovisual conference. Two or more MCUs can be cascaded to provide conferencing between additional VTUs or for network considerations (See 5.5.6). The MCU shall provide audio mixing and video switching capability as described in the following sections. This Profile defines the requirements for interactive multipoint video teleconferencing. Multipoint broadcast audiovisual transmission is outside the scope of this Profile.

In general, the MCU shall comply with the same requirements as the VTU. This includes FIPS PUB 178 and ITU-T Recommendations H.221, H.320, H.230, and H.242 except as noted in the following sections. In addition, the MCU shall comply with the requirements of ITU-T Recommendation H.231 which defines the functional representation of the MCU. The MCU and participating VTUs shall comply with ITU-T Recommendation H.243 which describes the detailed specifications and procedures for communications between two or more audiovisual terminals.

The various MCU functions and capabilities are enabled and disabled by transmission and reception of a set of digitally encoded commands. In the ITU Recommendations, each command is designated an acronym, typically three capital letters, such as VCF, which stands for Video Command Freeze-picture request.

5.5.2. Video, Communications and Control

5.5.2.1. General.

In general, each port of the MCU must meet the provisions of section 5.1 of this Profile, unless otherwise indicated. The following are the applicable sections that must be met, replacing VTU with MCU in each section:

5.1.1. General. However, the requirements of H.261 do not have to be met, unless video mixing is used. See 5.5.2.3. Note that if FEC reframing is performed (see 5.5.2.2.5) the requirements of section 5.4 of Recommendation H.261 dealing with FEC Coding shall apply. The ability to switch video is mandatory. The MCU shall also comply with the requirements set forth in ITU-T H.231 and H.243.

5.1.2. Operating Mode. The MCU shall provide bi-directional point-to-point operation with three or more VTUs.

5.1.3. Data Transmission Rates. This Profile mandates p=1 and p=2.

5.1.11. Concurrent Operation.

5.2. Control and Indication Signals. Note that MCUs have a somewhat different set of C&I signals from the VTUs.

5.2.1. Call Control (Handshaking).

5.2.2. Frame Structure.

5.5.2.2. Video Switching (Selective Presence).

In the video switching mode of multipoint, the video displayed at each VTU is the video from one other VTU. This is in contrast to Video Mixing (5.5.2.3) where the video from more than one source may be seen. Several methods are available for selecting whose video is seen by each VTU.

5.5.2.2.1. Voice Activated Switching.

The ability of the MCU to conduct a conference using voice activation to determine which VTU's video to broadcast to the other VTUs is mandatory. See section 5.5.3.3. The video to send to the selected VTU is at the discretion of the MCU manufacturer. The previously selected video is a good candidate. Voice activated switching can be overridden by action of the chair VTU (VCB), or a user control VTU (VCS or MCV).

5.5.2.2.2. User Broadcast Control.

The ability of the MCU to allow a user to broadcast its video to the other VTUs is mandatory. The MCU shall recognize and obey MCV and Cancel-MCV from the user VTU.

Multipoint Command Visualization-forcing (MCV) allows a VTU to request that an MCU broadcast its video to the other VTUs. Cancel-MCV returns the conference to voice activated switching mode. See ITU-T H.243 for a detailed description.

5.5.2.2.3. User Select Control.

The ability of the MCU to allow a user to select the video that the user's VTU receives is optional. When this capability is provided in the MCU, the MCU shall recognize and, if there is no conflict with other modes, obey VCS and Cancel-VCS from the user VTU.

Video Command Select (VCS) allows a VTU to request that the MCU send the video of a specific VTU to it. Cancel-VCS returns the conference to voice activated switching mode. See ITU-T H.243 for a detailed description.

5.5.2.2.4. Chair Control.

The ability of the MCU to conduct a Chair Control conference is optional. This is indicated by the signal Chair-control Indicate Capability (CIC).

An MCU having Chair Control capability shall provide a conference with the following capabilities:

a) Allow a VTU to display the terminal numbers of other VTUs. (TCU, TIN, TID, TIL, VIN)

b) Allow a Chair Control VTU to request the Chair. (CCA)

c) Allow a Chair Control VTU to release the Chair. (CIS)

d) Broadcast one VTU's video to all other VTUs as directed by the chair. (VCB)

e) Return the conference to voice activated switching mode as directed by the chair. (Cancel-VCB)

f) Drop a VTU from the conference. (CCD)

g) Drop the entire conference. (CCK)

When the chair VTU indicates which VTU's video should be seen by the other VTUs (VCB), the video seen by the chair selected VTU is at the discretion of the MCU manufacturer unless it is currently selected by VCS. The previously selected video is a good candidate.

A conference participant who wishes to speak during a chair control conference should request the floor from the conference chair. The conference participant's action, e.g., pressing a floor request button on the VTU, will cause the request for the floor (TIF) to be sent from the VTU to the MCU. The TIF shall be relayed to the chair control VTU by the MCU. The chair control VTU will indicate to the conference chair that another VTU requests the floor. The action taken in response to the request is at the chair's discretion. Possible actions include:

1) Ignore the request.

2) Defer the request while handling a request for the floor from another VTU.

3) Turn over the floor to the requesting VTU by broadcasting the requesting VTU's video to all other VTUs (VCB) and assuring that the VTU's audio is distributed to all other VTUs either by audio mixing or audio switching.
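
The three possible chair actions can be illustrated with a minimal Python sketch of the bookkeeping a chair control VTU might keep for relayed floor requests. The class name, method names, and the returned "VCB <terminal>" string are hypothetical placeholders, not signal encodings from H.230 or H.243.

```python
from collections import deque
from typing import Deque, Optional


class ChairFloorQueue:
    """Hypothetical chair-side record of pending floor requests (TIF)."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()
        self.floor_holder: Optional[str] = None

    def on_tif(self, vtu: str) -> None:
        # The MCU has relayed a TIF from `vtu`; record it only once.
        if vtu not in self.pending:
            self.pending.append(vtu)

    def ignore(self, vtu: str) -> None:
        # Action 1: drop the request without changing the conference.
        if vtu in self.pending:
            self.pending.remove(vtu)

    def defer(self, vtu: str) -> None:
        # Action 2: move the request behind the others still waiting.
        if vtu in self.pending:
            self.pending.remove(vtu)
            self.pending.append(vtu)

    def grant(self, vtu: str) -> str:
        # Action 3: turn over the floor. The returned text stands in for
        # the VCB command the chair VTU would send to the MCU.
        if vtu in self.pending:
            self.pending.remove(vtu)
        self.floor_holder = vtu
        return f"VCB {vtu}"
```

Audio distribution to the other VTUs (mixing or switching) is left out of the sketch; only the video broadcast request is represented.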

The following feature is optional.

a) Request to see a specified VTU's video. In a chair control conference, this command provides a roam capability allowing the chairman (or instructor) to selectively view the conference participants while they view the video selected by a previous VCB command or voice activated selection. (VCS)

5.5.2.2.5. FEC Framing on Switching.

The capability to do FEC re-framing is optional. When the source of the video signal is changed, due to any of the above procedures, video bit streams that are simply switched will cause a delay before a useful picture becomes available at the receiving VTU. Part of this delay is due to the fact that the FEC incorporated as part of H.261 must be reframed by the decoder. At low bit rates, this could take about half a second. This delay could be eliminated if the MCU performs FEC reframing. To perform FEC reframing, the MCU must always decode the incoming FEC framed video data and re-encode the selected video stream with its own FEC. This process occurs all the time, even when the video is not being switched. When the video source is switched, the FEC framing will not be lost. If this is done, the MCU must also be able to detect fill FEC frames, strip out the fill, and insert the fill in the outgoing bit stream, in order to keep the same bit rates.

5.5.2.2.6. Terminal Identifiers.

An MCU may optionally provide enhanced identification of the VTUs by using Terminal ID. Terminal ID allows VTUs to be assigned alphanumeric sequences such as names or locations, rather than arbitrary numbers. For example, an MCU could merge the ID of the selected video source with the video so that the resulting video contains an alphanumeric overlay. This would allow all receiving VTUs to see the ID of the source of the video. As another example, the chair control terminal could request the terminal IDs from the MCU in order to present a list of participants to the chair. This would aid the chair in selecting the proper VTU for various chair control functions. The MCU requests the Terminal ID from a VTU using either TCI or TCS. The VTU responds with TII or IIS, respectively. A VTU may request the Terminal ID of another VTU using TCP. The MCU responds with TIP. Using TCS and IIS (MBE) is the recommended method.

5.5.2.3. Video Mixing (Continuous Presence).

Video mixing involves spatially multiplexing the selected images into a single image in "split screen" format. This is an optional feature. It requires the decoding and encoding of the video code, and therefore requires meeting the requirements of H.261. The number of images that are mixed, the method of selection and control, and the video format used are left to the discretion of the manufacturer.

Standards for video mixing have not yet been defined. They will be added to this Profile when they are mature. While it may be possible to implement a video mixing scheme within the current standards, control of the scheme must be automatic or out-of-band since there is no facility in the current standards for the terminal to provide this control to the MCU.

5.5.2.4. Selection of SCM.

The Selected Communication Mode (SCM) is the set of bit-rates (total, video, audio, and data) that the MCU attempts to maintain during the conference. In order to communicate with the MCU, all Primary VTUs must use the same bit-rates, although different audio algorithms may be used if they have the same bit-rate.

The MCU shall determine the SCM for a conference. The SCM may also change during a conference as VTUs join or leave the conference. It is suggested that the user fully understand the impact that the SCM selection method provided by a vendor may have on conference operation. For example, a user who expects operation at 384 kbit/s using G.722 audio should confirm that the SCM can support that capability. The following methods may be used to determine the SCM. Other methods are possible.

a) The SCM is fixed as a permanent feature of the MCU.

b) The SCM is determined automatically by the MCU from the capabilities of the connected VTUs.

c) Several SCMs are provided. One is selected by the MCU service provider at the time the conference is set up.

d) The SCM is determined using procedures defined in MLP (T.120).
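
As an illustration of method b), the following Python sketch derives an SCM from the capabilities common to all connected VTUs. The data model (a set of supported p values and a set of supported audio bit-rates per VTU) and the policy of choosing the highest common values are assumptions made for this example; the Profile leaves the actual method to the vendor.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class VtuCapabilities:
    """Hypothetical summary of what one VTU has declared it can do."""
    name: str
    p_values: FrozenSet[int]     # supported transfer-rate multiples (p)
    audio_kbps: FrozenSet[int]   # supported audio bit-rates in kbit/s


def common_scm(vtus: List[VtuCapabilities]) -> Optional[Tuple[int, int]]:
    """Return (p, audio kbit/s) common to every VTU, or None if none exists."""
    if not vtus:
        return None
    common_p = frozenset.intersection(*(v.p_values for v in vtus))
    common_audio = frozenset.intersection(*(v.audio_kbps for v in vtus))
    if not common_p or not common_audio:
        return None
    # Illustrative policy only: take the highest value each VTU can support.
    return max(common_p), max(common_audio)
```

For example, two VTUs supporting p in {1, 2, 6} and {1, 2} with 56 kbit/s audio in common would yield (2, 56) under this policy.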

5.5.2.4.1. Minimum SCM.

The SCM determination method must include those modes that will enable at least minimum interoperability with VTUs having only the mandatory capabilities. This would be p=2, 56 kbit/s audio, 68.8 kbit/s or 70.4 kbit/s video, and 0 kbit/s data for unrestricted VTUs; and p=2, 48 kbit/s audio, 60.8 kbit/s or 62.4 kbit/s video, and 0 kbit/s data for restricted VTUs.
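
The video figures above follow from subtracting audio, data, and H.221 framing overhead from the total rate. The Python sketch below reproduces the arithmetic. It relies on the H.221 frame alignment and bit-rate allocation signals (FAS and BAS) together occupying 1.6 kbit/s of each framed channel. Reading the two alternative video rates as "both channels framed" versus "one framed channel" is an inference from the numbers, not a statement made in this Profile, and the function name is illustrative.

```python
FAS_BAS_KBPS = 1.6  # FAS + BAS overhead per H.221-framed channel, kbit/s


def video_rate_kbps(channel_kbps: float, p: int, framed_channels: int,
                    audio_kbps: float, data_kbps: float = 0.0) -> float:
    """Bit-rate left for video after framing, audio, and data are removed."""
    total = channel_kbps * p
    overhead = FAS_BAS_KBPS * framed_channels
    return round(total - overhead - audio_kbps - data_kbps, 1)


# Unrestricted network: 64 kbit/s per channel, p=2, 56 kbit/s audio.
unrestricted = (video_rate_kbps(64, 2, 2, 56), video_rate_kbps(64, 2, 1, 56))

# Restricted network: 56 kbit/s per channel, p=2, 48 kbit/s audio.
restricted = (video_rate_kbps(56, 2, 2, 48), video_rate_kbps(56, 2, 1, 48))
```

Under these assumptions `unrestricted` evaluates to (68.8, 70.4) and `restricted` to (60.8, 62.4), matching the rates listed in this section.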

5.5.2.4.2. Secondary VTUs.

In determining the SCM, the MCU may determine that many VTUs have a common capability set that is greater (more capable) than the remaining VTUs. The former VTUs are called Primary VTUs, while the latter are called Secondary VTUs. An optional capability is that the MCU can allow these Secondary VTUs to participate in the conference, but with a limited functionality. For example, a VTU on a network that can carry only p=1, might participate in a conference in which all other VTUs have video, but it does not. Without this optional capability, the Secondary VTUs would be dropped from the conference. The method of selection of the primary and secondary VTUs is left to the discretion of the manufacturer.
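
A minimal Python sketch of this partitioning follows. It assumes, purely for illustration, that each VTU's capability is summarized by the highest p it can carry and that the SCM has already been chosen; the function and parameter names are hypothetical, and a real MCU would compare full capability sets.

```python
from typing import Dict, List, Tuple


def partition_vtus(max_p: Dict[str, int], scm_p: int,
                   allow_secondary: bool
                   ) -> Tuple[List[str], List[str], List[str]]:
    """Split VTUs into (primary, secondary, dropped) for a chosen SCM."""
    primary = sorted(name for name, p in max_p.items() if p >= scm_p)
    limited = sorted(name for name, p in max_p.items() if p < scm_p)
    if allow_secondary:
        # Optional capability: less capable VTUs stay in with reduced function.
        return primary, limited, []
    # Without the option, VTUs that cannot meet the SCM are dropped.
    return primary, [], limited
```

With VTUs A and B at p=2 and VTU C limited to p=1, an SCM at p=2 leaves C as a Secondary VTU when the option is enabled and drops it otherwise.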

5.5.3. Audio.

5.5.3.1. General.

The MCU shall meet the requirements of sections 5.3.2.1, 5.3.2.2, 5.3.2.3, and 5.3.3 of this Profile. These sections state that G.722 and G.728 are optional; however, it is highly recommended that they be included.

The MCU shall have both G.711 A-law and μ-law audio capability. This permits conferences with European VTUs which might have only A-law audio.

5.5.3.2. Audio Mixing.

Audio mixing shall be the default mode of operation of the MCU. Audio mixing shall be accomplished by the summation of the linear (PCM or analog) audio signals received. In general, all the received audio signals are summed, but small signals may be suppressed in order to minimize interference in large conferences. The actual method is left to the discretion of the manufacturer.
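
The summation described above can be sketched in Python as follows. The sketch assumes 16-bit linear PCM samples already decoded from each VTU, a simple peak-amplitude threshold for suppressing small signals, and exclusion of each listener's own audio from the mix it receives. The threshold test and the own-audio exclusion are assumptions of this example; the Profile leaves the actual method to the manufacturer.

```python
from typing import Dict, List

PCM_MAX, PCM_MIN = 32767, -32768  # 16-bit linear PCM limits


def mix_audio(frames: Dict[str, List[int]],
              threshold: int = 0) -> Dict[str, List[int]]:
    """Return, per listening VTU, the sum of the other active VTUs' samples."""
    length = min((len(s) for s in frames.values()), default=0)
    # Suppress small signals: keep inputs whose peak reaches the threshold.
    active = {name: s for name, s in frames.items()
              if max((abs(x) for x in s), default=0) >= threshold}
    mixes: Dict[str, List[int]] = {}
    for listener in frames:
        total = [0] * length
        for name, samples in active.items():
            if name == listener:
                continue  # assumption: a VTU does not hear itself
            for i in range(length):
                total[i] += samples[i]
        # Clip so the sum stays within the 16-bit range.
        mixes[listener] = [max(PCM_MIN, min(PCM_MAX, v)) for v in total]
    return mixes
```

Each mix would then be re-encoded with the audio algorithm in use toward that VTU.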

Audio switching connects the audio from only one VTU to the other VTUs. In this case, audio signals from the other VTUs are not mixed. Audio switching may be desirable in some applications such as remote training where spurious sound from the non-speaking sites is unwanted. Audio switching may also be used to connect VTUs in private conversations. The control for audio switching may follow the results of video switching commands, such as VCB, or it may be out of band.

Because the audio must be decoded and recoded, and video is switched, there may be more delay in the audio channel than in the video channel. While delay compensation is not required, a delay in the video channel is allowable to maintain audio and video synchronization. The time delay between audio and video signals shall be measured as specified in Annex C of H.261.

5.5.3.3. Voice Activated Switching.

The MCU shall analyze the audio inputs to determine which participant will have the floor next. The algorithm for this determination is up to the discretion of the manufacturer. The result of this algorithm shall be used to determine which video signal to transmit to each VTU or MCU in the absence of VCB, VCS or MCV. The video to be sent to the VTU having the floor is up to the discretion of the manufacturer. The previously selected video is a good candidate.
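
The selection rule can be sketched in Python as shown below. The floor-determination algorithm itself is not modelled; the current and previous floor holders are taken as inputs. The ordering among the overrides (a receiver's own VCS first, then the chair's VCB, then a user's MCV) is an assumption made for this example, as is the choice of the previously selected video for the VTU that is itself being broadcast. All names are hypothetical.

```python
from typing import Dict, Optional


def select_video_source(receiver: str,
                        speaker: str,
                        previous_speaker: Optional[str],
                        vcb_target: Optional[str] = None,
                        mcv_source: Optional[str] = None,
                        vcs_choice: Optional[Dict[str, str]] = None
                        ) -> Optional[str]:
    """Return the VTU whose video the MCU sends to `receiver`."""
    vcs_choice = vcs_choice or {}
    # A VCS selection applies only to the VTU that made it.
    chosen = vcs_choice.get(receiver)
    if chosen is not None and chosen != receiver:
        return chosen
    # Otherwise use the broadcast source: VCB, then MCV, then voice activation.
    broadcast = vcb_target or mcv_source or speaker
    if receiver != broadcast:
        return broadcast
    # The broadcast VTU gets the previously selected video, if there is one.
    return previous_speaker if previous_speaker != receiver else None
```

With no overrides, every VTU receives the current speaker's video and the speaker receives the previous speaker's video.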

5.5.4. Data Communications.

At this time, the ITU-T Study Group 8 is nearing completion of the T.120 Series of Recommendations. They are not yet implemented in commercial products. It is anticipated that T.120 applications will appear in the next one to three years. Until these become available, proprietary or out of band data transfer techniques may be used.

The MCU may optionally support data communications using the Low Speed Data channel, High Speed Data channel, Low Speed MLP channel, and/or the High Speed MLP channel as defined in H.221. The MLP data channels may contain information utilizing the Transmission Protocols for Multimedia Data defined in the ITU-T T.120 Series of Recommendations. The T.120 series not only includes data communications protocols and procedures, but also includes optional applications such as still image transfer, annotation, pointing, binary file transfer, and conference control.

In order for VTUs having T.120 capability to interact with each other in a multipoint conference, the MCU must follow the procedures defined in H.243 for opening and closing MLP data channels, and also be T.120 capable.

5.5.5. Confidentiality and Security.

As an option, the MCU may provide confidentiality or secure operation. When required, confidentiality shall be provided as described in 5.4. Security for classified information shall be provided as described in B.5.4.4.

5.5.6. Cascading.

The ability of an MCU to participate in a conference involving more than one MCU is optional and is called cascading. There are two optional types of cascading, Simple and Principal/Satellite. If the maximum number of MCUs to be connected is two, the Simple cascading capability is all that is needed. If three or more MCUs need to be connected, then Principal/Satellite cascading is required, but note that the Principal/Satellite method will work with just two MCUs.

The maximum number of MCUs between any two VTUs shall not exceed three. For a star configuration, the Principal MCU shall be designated before the call as the MCU at the center of the star. In Principal/Satellite cascading the Principal MCU shall transmit the MIN command to the Satellite MCU. In the case of contention for Principal designation, the RAN command may also be used as in the contention resolution procedure in ITU-T H.243. The RAN command is mandatory for MCUs that do not support administration of Principal/Satellite status, or where the customer does not wish to make use of the administration of Principal/Satellite status feature.
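
The limit of three MCUs between any two VTUs can be checked against a planned cascade with a short Python sketch. It assumes the cascade is described as a list of MCU-to-MCU links, that every MCU hosts at least one VTU, and that all MCUs are connected; the function name is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def max_mcus_on_path(links: Iterable[Tuple[str, str]]) -> int:
    """Largest number of MCUs traversed between VTUs on any two MCUs."""
    graph: Dict[str, Set[str]] = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    worst = 0
    for start in graph:
        # Breadth-first search; count MCUs on the path, including both ends.
        count = {start: 1}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in count:
                    count[nxt] = count[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(count.values()))
    return worst
```

A star with one Principal and any number of Satellite MCUs gives 3, which meets the limit, while a chain of four MCUs gives 4 and does not.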

5.5.7. Simultaneous Conference Operation.

An MCU may be used in more than one conference at a time. This is also known as segmentable operation. The number of simultaneous conferences that can be held is not a matter for standardization, but may be specified in the procurement document.

A Classified MCU shall have special requirements imposed in order to support multiple simultaneous independent classified conferences. See B.5.4.4.1.

5.5.8. Value Added Services.

An MCU may optionally offer value added services that are not within the scope of the ITU-T H.320 Recommendations. Some of these services may be activated by the VTU using SBE characters. Value added services offer additional capabilities to the conference that are accessed by the VTU. These services might include conference access codes (passwords), requesting an operator, accessing the reservation system, adding another party, etc. These services would be accessed by character sequences such as #0 (# and zero on the keypad) for the conference operator. The appropriate character sequences may be obtained by audio prompt or other means. These character sequences are currently not standardized. Other value added services are also possible.
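
Because the sequences are not standardized, an MCU would need its own mapping from received SBE characters to services. The Python sketch below shows one possible shape for such a mapping. Only the operator sequence (# followed by zero) is taken from the text above; every other sequence, the service names, and the class itself are invented placeholders.

```python
from typing import Dict, Optional

# Only the operator sequence comes from the text; the rest are hypothetical.
SERVICES: Dict[str, str] = {
    "#0": "request_operator",
    "#1": "add_party",            # hypothetical sequence
    "#2": "reservation_system",   # hypothetical sequence
}


class SbeServiceParser:
    """Collects SBE characters from one VTU and recognizes service sequences."""

    def __init__(self, services: Dict[str, str] = SERVICES) -> None:
        self.services = services
        self.buffer = ""

    def feed(self, ch: str) -> Optional[str]:
        """Consume one character; return a service name when one completes."""
        if ch == "#":
            self.buffer = "#"   # '#' always starts a new sequence
            return None
        if not self.buffer:
            return None         # characters outside a sequence are ignored
        self.buffer += ch
        service = self.services.get(self.buffer)
        longest = max(len(seq) for seq in self.services)
        if service is not None or len(self.buffer) >= longest:
            self.buffer = ""    # sequence finished or cannot match
        return service
```

Feeding the characters "#" and "0" in turn returns None and then "request_operator".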
