Point-of-Capture Archiving and Editing of Personal Experiences from a Mobile Device
4. SENSOR-ASSISTED VIDEO PROCESSING TECHNIQUES
From the interviews with the participants in the pilot study, we also identified two user requirements for a mobile editing tool. First, as the number of content recordings increases, content needs to be organized by recording location rather than by recording time, because users reported preferring this. Second, there were many occasions on which users had to remove many blurry frames, so there needs to be a mechanism that automatically removes blurry frames caused by hand-induced camera shaking.
Solutions to these two requirements can be found in many traditional image processing and content analysis techniques [26][21] that, at the time of content production, extract contextual metadata such as location, objects, amount of shaking, and lighting levels. However, because of the limited computing capability of a mobile device, these computationally intensive image/video analysis techniques are not suitable for authoring on a mobile device. We instead adopt the context-aware media capturing approach, in which sensors are deployed at the point of capture to assist the capture and inference of a variety of context metadata. This sensor-assisted approach generally requires less computation and is therefore well suited to a resource-poor mobile device.
To meet the user requirements, mProducer incorporates two sensors to automatically create contextual metadata at the point of capture: (1) a global positioning system (GPS) receiver, which provides location metadata, and (2) a tilt sensor, which measures the amount of camera shaking. The location metadata for each content recording is used in the location-based content management, so that a user can easily navigate the map to locate previously recorded content. The camera shaking measurements are used to detect excessive camera shaking, which results in blurry, unusable frames, so that these frames can be cut automatically. Figure 5 shows the hardware components of the prototype PDA system together with a GPS receiver and a tilt sensor.
Figure 5: The HP iPAQ 5550 with camera, GPS receiver, and tilt sensor
GPS Receiver: We use the GPS-CF card from CHIPCOM Electronics. Each time a user records a video clip, mProducer queries the GPS receiver for the current location and annotates the clip with it. The location information of each clip is used to construct the location-based content management map (described in Section 3). For the smart phone version of mProducer, we use the Bluetooth-GPS receiver from Pretec Electronics Corporation.
Tilt Sensor: We use the TiltControl CF card made by ECER Technology, shown in Figure 5. It contains an accelerometer that measures the horizontal and vertical tilt of the device, in both direction and magnitude. Changes in the tilt are used to compute the magnitude of camera shaking and to predict its impact on video quality. We elaborate on how to use the tilt sensor for camera shaking detection in the following subsection.
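The location annotation described for the GPS receiver above can be illustrated with a minimal sketch. It is not mProducer's implementation: the `Clip` record, the `group_by_location` helper, and the grid-cell size are hypothetical names and values chosen only to show how clips tagged with a location at capture time could later be grouped by place rather than by time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Clip:
    """A recorded clip annotated with the location read at capture time (hypothetical record)."""
    filename: str
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees


def group_by_location(clips, cell_deg=0.001):
    """Group clip filenames by a coarse latitude/longitude grid cell.

    cell_deg is an assumed cell size in degrees; it is not a value from the paper.
    """
    groups = defaultdict(list)
    for clip in clips:
        cell = (round(clip.latitude / cell_deg), round(clip.longitude / cell_deg))
        groups[cell].append(clip.filename)
    return dict(groups)


# Example with made-up coordinates: the first two clips fall into the same cell.
example_clips = [
    Clip("clip1.avi", 25.0170, 121.5397),
    Clip("clip2.avi", 25.01702, 121.53968),
    Clip("clip3.avi", 25.0330, 121.5654),
]
example_groups = group_by_location(example_clips)
```

Grouping by a fixed grid is only one simple choice; a map-based interface could equally cluster clips by distance.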
Experiment to Identify Camera Shaking Pattern
We use a tilt sensor to measure the level of camera shaking and to automate the detection and removal of shaking artifacts. This is an ideal alternative to computationally intensive video analysis on a resource-poor mobile device. To determine the signature of camera shaking, we conducted an experiment to distinguish excessive shaking (e.g., resulting from putting the device in a pocket while walking) from the moderate shaking that comes naturally with unstable hands when filming while walking. Our experiment is described below.
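As a rough illustration of the removal step, the sketch below turns a sequence of per-reading shaking flags into time intervals that could be cut from a clip. The function name `shaky_intervals_ms` is hypothetical, and the 200 ms default is an assumption taken from the sampling setting reported in the data acquisition step; how mProducer actually maps sensor readings to video frames is not specified in this section.

```python
def shaky_intervals_ms(flags, sample_interval_ms=200):
    """Merge runs of consecutive flagged readings into (start_ms, end_ms) intervals.

    flags[i] is True when reading i was judged to show excessive shaking.
    Times are measured from the start of the recording.
    """
    intervals = []
    run_start = None
    for index, shaky in enumerate(flags):
        if shaky and run_start is None:
            run_start = index  # a new shaky run begins here
        elif not shaky and run_start is not None:
            intervals.append((run_start * sample_interval_ms, index * sample_interval_ms))
            run_start = None
    if run_start is not None:
        # The recording ended while still shaking.
        intervals.append((run_start * sample_interval_ms, len(flags) * sample_interval_ms))
    return intervals


example = shaky_intervals_ms([False, True, True, False, False, True])
```

Working in time intervals rather than frame indices keeps the sketch independent of the video frame rate, which is not given here.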
Data Acquisition: The TiltCONTROL sensor monitors the vertical and horizontal tilt of the device throughout the experiment. A series of readings is recorded and analyzed to determine whether camera shaking occurs. The sampling interval of tilt sensing is set to 200 milliseconds. The standard deviation of the changes in the device angles is computed over a sliding window of the most recent 10 readings.
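The sliding-window computation can be sketched as follows for one tilt axis. This is a minimal sketch under stated assumptions: the function name is hypothetical, the "changes" are taken to be differences between consecutive readings inside the window (so 10 readings give 9 changes), and the population standard deviation is used, since the text does not say which variant was applied.

```python
from collections import deque
from statistics import pstdev

WINDOW_SIZE = 10  # most recent 10 readings, as stated in the text


def sliding_std_of_changes(angles, window=WINDOW_SIZE):
    """Yield the standard deviation of consecutive angle changes for each full window.

    angles is an iterable of tilt readings in degrees for a single axis,
    sampled at a fixed interval (200 ms in the experiment).
    """
    recent = deque(maxlen=window)  # automatically drops the oldest reading
    for angle in angles:
        recent.append(angle)
        if len(recent) == window:
            readings = list(recent)
            changes = [b - a for a, b in zip(readings, readings[1:])]
            # Assumption: population standard deviation of the changes.
            yield pstdev(changes)


# A device held perfectly still yields zero deviation in every window.
still = list(sliding_std_of_changes([5.0] * 12))
```

Because each new reading only replaces the oldest one in the window, the per-reading cost stays small, which matches the motivation of avoiding heavy video analysis.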
Shaking Detection: Device shaking can be detected when changes in a device's tilt angles create oscillations between two opposite directions. The intensity of shaking can be measured by calculating the rate of change in the device tilt angles and the oscillation rate. Walking while holding the device creates oscillations of smaller magnitude (see the middle graph of Figure 6; the X-axis represents time and the Y-axis represents the magnitude of the change in degrees per unit time). Walking with the device in a pocket also creates oscillations, but of larger magnitude (see the right graph of Figure 6). For the experimental setup, we measured three activities for each participant: (1) holding the mobile device while sitting or standing still for 2 minutes (collecting 591 samples), (2) holding the mobile device while walking for 2 minutes (collecting 591 samples), and (3) putting the PDA in a pocket or a bag while walking for 2 minutes (collecting 591 samples).
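One plausible way to estimate the oscillation rate from tilt readings is to count direction reversals. The sketch below is an assumption, not the procedure used in the experiment: it treats two reversals (there and back) as one oscillation, ignores readings with no change, and uses a hypothetical function name.

```python
def oscillation_rate(angles, sample_interval_s=0.2):
    """Estimate oscillations per second for one tilt axis.

    A reversal is a switch between increasing and decreasing tilt;
    two reversals are counted as one full oscillation (assumption).
    """
    changes = [b - a for a, b in zip(angles, angles[1:])]
    # Keep only the direction of each non-zero change.
    directions = [1 if change > 0 else -1 for change in changes if change != 0]
    reversals = sum(1 for a, b in zip(directions, directions[1:]) if a != b)
    duration_s = (len(angles) - 1) * sample_interval_s
    if duration_s <= 0:
        return 0.0
    return (reversals / 2) / duration_s


# Tilt that flips direction at every reading over 2 seconds.
rate = oscillation_rate([0, 10, 0, 10, 0, 10, 0, 10, 0, 10, 0])
```

With one reading every 200 ms, this estimate cannot exceed 2.5 oscillations per second, since a reversal can occur at most once per reading.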
Result: Based on the empirical data shown in Table 2, we have determined two conditions for excessive shaking: (1) the standard deviation of the tilt angles is larger than 20° (degrees), a threshold chosen because 89.9% of the actual shaking frames (externally observed) have standard deviation values above it, and (2) the frequency of oscillations in both directions exceeds 1.5 oscillations per second, a threshold chosen because 76.5% of the actual shaking frames have values above it. Figure 6 depicts a partial result of one participant's experiment. In the normal case, the standard deviation is small and the vibration is moderate. Walking introduces constant vibration, but the standard deviation stays below 20°. Under excessive shaking, the standard deviation is high and the vibration is frequent. This pattern lets the system detect camera shaking with a simple computation of standard deviation, which demonstrates how sensor measurements can assist in processing video content using simple computation.
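The two conditions can be combined into a simple check, sketched below and applied to the values in Table 2. Two points are assumptions rather than statements from the text: both conditions are required to hold together, and a condition counts as met when either axis exceeds its threshold. The function and constant names are hypothetical.

```python
STD_THRESHOLD_DEG = 20.0  # standard deviation threshold from the text
FREQ_THRESHOLD_HZ = 1.5   # oscillation frequency threshold from the text


def is_excessive_shaking(std_h, std_v, freq_h, freq_v):
    """Return True when both threshold conditions hold (assumed to be combined with AND)."""
    std_exceeded = max(std_h, std_v) > STD_THRESHOLD_DEG
    freq_exceeded = max(freq_h, freq_v) > FREQ_THRESHOLD_HZ
    return std_exceeded and freq_exceeded


# Rows of Table 2: (std horizontal, std vertical, freq horizontal, freq vertical)
TABLE_2 = {
    "Sitting": (2.62, 3.00, 1.36, 0.76),
    "Walking": (5.27, 7.13, 1.89, 1.97),
    "Pocketing": (64.72, 75.96, 1.73, 1.85),
}

results = {activity: is_excessive_shaking(*row) for activity, row in TABLE_2.items()}
```

Under these assumptions only pocketing is flagged. Walking exceeds the frequency threshold but not the standard deviation threshold, so the standard deviation condition is what separates it from excessive shaking.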
Figure 6. Measured Oscillation Magnitudes for Three Activities
Table 2. Oscillation Measurements for Three Activities of Sitting, Walking, and Pocketing
             Standard deviation of tilt     Frequency of oscillations
             angle changes (degrees)        (per second)
Activity     Horizontal    Vertical         Horizontal    Vertical
Sitting      2.62          3.00             1.36          0.76
Walking      5.27          7.13             1.89          1.97
Pocketing    64.72         75.96            1.73          1.85