Implementation of Streaming Control

4.6 Implementation of RTP Streaming

4.6.3 Streaming Control

4.6.3.1 Implementation of Streaming Control

The following paragraphs describe timestamps and timescales, pre-sent data, and the calculation of the start time and the sleep time.

As a side note, without loss of generality, the propagation delay and the media decompression time are neglected in the proposed system. In other words, it is assumed that as soon as the data are injected into the network by the server, the client can play them back.

Timestamps and Timescales

Different media files may have different sets of timestamps that advance at different rates, depending on the desired accuracy [2]. To synchronize different streams, for example an audio track and a video track in a presentation, timestamps must be convertible from one stream to another and into normal clock time (i.e. hours, minutes, seconds and milliseconds). This is the purpose of timescales: a timescale gives the number of timestamp units that make up one second. For example, frame data whose timestamp equals half of its timescale has a presentation time of 500 milliseconds, while frame data whose timestamp equals a quarter of its timescale has a presentation time of 250 milliseconds. As a typical example, in the proposed system the timescale for the live-captured video is 1000, while the timescale for the stored video is 90000. To synchronize the two, their least common multiple (i.e. 90000) can be used. Hence, the frame from the former with a timestamp of 500 is played at the same time as the frame from the latter with a timestamp of 45000, because 500 over 1000 converts to 45000 over 90000.
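The conversions described above can be sketched as follows. This is a minimal illustration rather than the code of the proposed system; the function names `to_milliseconds` and `rescale` are hypothetical, and exact fractions are used only to avoid rounding in the example.

```python
from fractions import Fraction
from math import gcd


def to_milliseconds(timestamp: int, timescale: int) -> Fraction:
    """Presentation time in milliseconds; timescale is ticks per second."""
    return Fraction(timestamp * 1000, timescale)


def rescale(timestamp: int, src_timescale: int, dst_timescale: int) -> Fraction:
    """Express a timestamp counted in src_timescale ticks in dst_timescale ticks."""
    return Fraction(timestamp * dst_timescale, src_timescale)


# Timescales taken from the text: live-captured video and stored video.
LIVE_TIMESCALE = 1000
STORED_TIMESCALE = 90000

# Least common multiple of the two timescales, used as the common timescale.
common_timescale = LIVE_TIMESCALE * STORED_TIMESCALE // gcd(LIVE_TIMESCALE, STORED_TIMESCALE)

# Live frame at timestamp 500, expressed in the common timescale.
live_in_common = rescale(500, LIVE_TIMESCALE, common_timescale)
```

With these values, a live frame stamped 500 and a stored frame stamped 45000 both map to the same presentation time of 500 milliseconds.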

Pre-sent data

Because network jitter and variations in round-trip delay exist on the current Internet, it is often desirable for the server to transmit data ahead of their actual presentation times, so that the client can still play back the media smoothly even when the network is congested for a short while. The amount of pre-sent data should be smaller than or equal to the size of the client buffer so as not to overflow it. In the proposed system, data are transmitted one second ahead of their actual presentation times. For example, when the streaming process has just begun, the frame data with presentation times from 0 to 1 second are transmitted as fast as possible. For media with a frame rate of 25 frames per second, the frame with a presentation time of 1 second and 40 milliseconds is sent after 40 milliseconds, and the frame with a presentation time of 1 second and 80 milliseconds is sent after another 40 milliseconds. This process continues until the presentation ends.

It is therefore essential that the system designer use two variables: one to track the timestamp of the last transmitted frame, and one to track the timestamp of the frame that the client may be playing now. Each time the server reads its stopwatch, it derives the timestamp of the frame that the client may currently be playing from the time difference between the current time and the start time of the streaming process, and then compares this timestamp with the timestamp of the last transmitted frame. If the difference between them is smaller than the value of the media’s timescale (i.e. the client playback buffer may hold less than one second's worth of data), the server transmits the next frame immediately; otherwise, it sleeps for a while.
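The transmission decision can be sketched as below. This is an illustrative sketch under stated assumptions, not the server code of the proposed system: wall-clock times are integer milliseconds, frames are evenly spaced, and all function names are hypothetical. An initial `last_sent_ts` of minus one frame duration is used here only as a convention for "nothing sent yet".

```python
def playback_timestamp(now_ms: int, start_ms: int, timescale: int) -> int:
    """Timestamp of the frame the client may be playing now."""
    return (now_ms - start_ms) * timescale // 1000


def should_send_now(last_sent_ts: int, now_ms: int, start_ms: int, timescale: int) -> bool:
    """True while the client may hold less than one second of pre-sent data."""
    playing_ts = playback_timestamp(now_ms, start_ms, timescale)
    return last_sent_ts - playing_ts < timescale


def send_due_frames(last_sent_ts: int, frame_ticks: int,
                    now_ms: int, start_ms: int, timescale: int) -> list:
    """Return the timestamps of all frames that are due at this wake-up."""
    sent = []
    while should_send_now(last_sent_ts, now_ms, start_ms, timescale):
        last_sent_ts += frame_ticks  # advance to the next frame
        sent.append(last_sent_ts)    # a real server would transmit it here
    return sent


# 25 frames per second at timescale 1000 gives 40 ticks per frame.
initial_burst = send_due_frames(-40, 40, now_ms=0, start_ms=0, timescale=1000)
after_40_ms = send_due_frames(1000, 40, now_ms=40, start_ms=0, timescale=1000)
```

In this sketch the initial burst covers the frames stamped 0 through 1000, and the wake-up 40 milliseconds later sends only the frame stamped 1040, which matches the example in the text.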

Calculation of Start Time

During the lifetime of a streaming session, there are three major scenarios that should be addressed:

1. Normal presentation: the streaming process continues to run without any user intervention (PAUSE, REWIND, or FORWARD).

2. Paused presentation: the streaming process is paused and then resumed at the very same time instant of presentation.

3. Sought presentation: the streaming process is paused and then resumed at a different time instant of presentation. In other words, the presentation of the media data was rewound or forwarded (sought).

In the first scenario, the server periodically reads its stopwatch, performs the calculation, and makes the transmission decision described above. In the second and third scenarios, because the streaming process is paused for a period of time and then resumed, the time difference between the recorded start time and the current time can no longer be used to calculate the correct timestamp of the frame that the client is playing; consequently, the server cannot make the right transmission decision. This problem can be solved by shifting the original start time to a new time instant placed at an appropriate distance before the current time. The calculation of this distance differs between the two scenarios: in the paused case, the pre-sent frames are still in the client’s buffer waiting to be played, while in the sought case, the pre-sent frames become useless to the client because the presentation has been sought (rewound or forwarded) to a new point in time.

For the second scenario, to work out how far before the current time the new start time should be placed, one has to realize that since the pre-sent frames are still useful to the client, the state of the streaming process should be restored to the one just before pausing. In other words, this distance should equal the time difference between the time of pausing and the original start time. The timestamp of the frame that the client should be playing now and the timestamp of the last transmitted frame then remain the same as they were just before pausing. Since all the conditions for the transmission decision are unchanged, the decision made on resuming is identical to the one that would have been made just before pausing.
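The start-time shift for the paused scenario can be sketched as follows. This is a minimal illustration assuming that the server records the stopwatch reading at the moment of pausing; the function and parameter names are hypothetical.

```python
def start_time_after_pause(old_start_ms: int, paused_at_ms: int, resumed_at_ms: int) -> int:
    """Shift the start time so that the elapsed presentation time is preserved."""
    elapsed_before_pause = paused_at_ms - old_start_ms
    return resumed_at_ms - elapsed_before_pause


# Streaming started at 10 s, was paused at 15 s, and is resumed at 60 s.
new_start_ms = start_time_after_pause(10_000, 15_000, 60_000)
```

Here the new start time lies 5 seconds before the moment of resuming, exactly the amount of presentation time that had elapsed before the pause, so the playback timestamp computed on resuming equals the one computed just before pausing.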

In the last scenario, since the pre-sent data are no longer useful to the client even though they still reside in the client’s buffer, the calculation of the new start time is slightly different. Intuitively, when the streaming process is resumed, the client is waiting for the media data at the sought presentation time, so the server has to stream those data right away. To make this happen, the time difference between the current time and the new start time should be set equal to the sought presentation time; equivalently, the new start time should be placed before the current time at a distance equal to the sought presentation time. In addition, since the server has not yet transmitted any media data ahead of time, the timestamp of the last transmitted frame is set to the sought presentation time. As a result, for the first few iterations the server does not sleep but transmits media data continuously, because the difference between the timestamp of the frame that the client may be playing now and the timestamp of the last transmitted frame is too small, and the client’s buffer must not be allowed to drain.
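The state reset for the sought scenario can be sketched as below. This is an illustrative sketch only; the function name `state_after_seek` and its parameters are hypothetical, and times are assumed to be integer milliseconds.

```python
def state_after_seek(resumed_at_ms: int, sought_ms: int, timescale: int) -> tuple:
    """Return (new start time in ms, timestamp of the last transmitted frame)."""
    # Place the start time so that (now - start) equals the sought presentation time.
    new_start_ms = resumed_at_ms - sought_ms
    # Nothing has been pre-sent for the new position yet.
    last_sent_ts = sought_ms * timescale // 1000
    return new_start_ms, last_sent_ts


# Resume at stopwatch reading 60 s, seeking to presentation time 30 s, timescale 90000.
new_start_ms, last_sent_ts = state_after_seek(60_000, 30_000, 90_000)

# Timestamp the client is assumed to be playing right after the seek.
playing_ts = (60_000 - new_start_ms) * 90_000 // 1000
```

Right after the seek, `last_sent_ts - playing_ts` is zero, which is below the timescale, so the transmission rule keeps sending frames without sleeping until one second of data has been pre-sent again.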

Calculation of Sleep Time

As described previously, when the server reads its stopwatch and finds that it is still too early to transmit the next part of the media data, it can relinquish the CPU so that other processes or threads can run. So how long should the server sleep? If it sleeps too long, it will not be able to follow the schedule of the media file precisely. At the other extreme, if it rests for only a very short period of time, the overhead of frequent context switches between processes or threads degrades the overall system performance. The rule of thumb is that if the current time is x milliseconds earlier than the time at which the next frame should be transmitted, the server should sleep for max(1, x - c) milliseconds. The fixed offset of c milliseconds is needed because if the context switch by which the server regains the CPU is delayed for some reason, the server will not wake up on time. Therefore, it is essential to leave some safety margin.
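The rule of thumb can be sketched as follows. The text does not give a value for c, so the 5 milliseconds used here is purely an assumption for illustration; the function names are likewise hypothetical.

```python
import time


def sleep_time_ms(x_ms: int, c_ms: int) -> int:
    """Sleep duration max(1, x - c): wake up c ms early, but sleep at least 1 ms."""
    return max(1, x_ms - c_ms)


def nap_until_next_frame(x_ms: int, c_ms: int = 5) -> None:
    """Relinquish the CPU for the computed duration (c_ms = 5 is an assumed margin)."""
    time.sleep(sleep_time_ms(x_ms, c_ms) / 1000.0)


# Next frame is due in 40 ms; with an assumed 5 ms margin the server sleeps 35 ms.
normal_case = sleep_time_ms(40, 5)
# Next frame is due in 3 ms, less than the margin; the 1 ms floor applies.
floor_case = sleep_time_ms(3, 5)
```

The floor of 1 millisecond keeps the server from calling sleep with a zero or negative duration when the next frame is due sooner than the safety margin.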

4.6.3.2 Important Variables, Their Meanings, and Their Initial
