Texas Instruments Semiconductors - Digital Signal Processing Solutions

Video Restoration on a Multiple TMS320C40 System

Man-Nang CHONG

Showbhik Kalra

Dilip Krishnan

Email: asmnchong@ntuvax.ntu.ac.sg

School of Applied Science

Nanyang Technological University

Singapore 639798

Literature Number

August 1996

ABSTRACT


This application report describes a parallel video restoration system to restore old motion picture archives. A Gaussian Weighted, Bi-directional 3D Auto-Regressive (B3D-AR) algorithm is used to alleviate the presence of noise in the old archives. Common forms of degradation found in such archives are "dirt and sparkle" and scratches. The distortion is caused either by the accumulation of dirt or by the film material being abraded.

While most existing image restoration algorithms blur the edges of moving objects in the vicinity of occluded and uncovered image regions, this algorithm is able to suppress mixed noise processes and recover lost signals in both the covered and uncovered regions of an image sequence. The video restoration system is tested on artificially corrupted image sequences and on naturally degraded video (full PAL image size). Samples of the original and corresponding restored image sequences are contained in this report.

The B3D-AR algorithm is implemented in parallel on an array of 15 Texas Instruments TMS320C40 processors connected in a tree configuration. Two different parallel algorithms are implemented; close to linear speed-up is achieved by the load-balanced parallel algorithm.

INTRODUCTION


While many old movies are recorded on flammable nitrate-based negatives that decay rapidly, modern movies are made with safer acetate-based 35mm films. However, both types of media are susceptible to degradation such as gouges, scratches and the accumulation of dirt. The result is a variety of artefacts that make the old movies look their age.

The deterioration in old movies can be stopped by adopting digital film archiving technology, but defects that are already present in the films will be inherited by the digital copies. Restoration of degraded motion pictures is a highly labour-intensive and extremely costly undertaking. A much-publicised example [1] is the restoration of Disney's 1937 masterpiece -- Snow White and the Seven Dwarfs -- which was re-released in digital form in 1993. It would be financially rewarding to reproduce old movies with as much fidelity to the original negatives as possible, so that the movies can be re-released in higher-quality formats such as video-on-demand, digital video-disk, and HDTV. Therefore, a video restoration system which can automatically remove artefacts in film archives will be useful to the entertainment and broadcast industry.

This application report describes an Auto-Regressive (AR) model-based restoration algorithm and its parallel implementation on a network of Texas Instruments TMS320C40 processors. The restoration process begins with the conversion of the degraded film into digital form with the aid of a real-time video digitiser. The success of automatic video restoration relies on the fact that image frames in a movie do not change significantly from one frame to the next, except for changes due to moving objects in the scene. This means the frames preceding and succeeding the current image frame provide enough repeated information to allow us to detect the presence of degraded regions in the image. This same redundancy provides a way to mathematically model the image region in the vicinity of these artefacts, so that meaningful information may be used to fill in the corrupted image regions, resulting in a restored image frame. The scene changes due to moving objects and uncovered background signals must be identified to yield an accurate model. To account for the inter-frame changes caused by moving objects in the scene, the motion of these objects is first computed by a motion estimation algorithm. Once the moving regions have been compensated for, a 3-Dimensional Auto-Regressive (3D-AR) model is built from the information contained in both the preceding and succeeding frames. Two of these dimensions describe changes within the image frame and the third describes changes between frames. Restoration is done one frame at a time, so that each restored frame can then be used to help restore subsequent corrupted frames in the image sequence. The proposed AR model-based approach has an important advantage over the global filtering strategy [2], which tends to blur sharp edges or homogenise highly textured regions in both the distorted and uncorrupted image regions.
Statistical approaches such as Markov Random Field Modelling [3] have produced good results, albeit at a higher computational cost.

The computational demands of estimating the motion of moving objects in the image and formulating the 3D image sequence model are still huge. The time required to restore just a single PAL frame on one workstation can run to a few hundred seconds. Timely restoration can only be achieved through the design of fast (computationally efficient) algorithms and the use of parallel processing techniques on a network of Digital Signal Processors (DSPs).

In this project, we describe in detail a fast video restoration system where distorted old archives can be digitised, restored, and transferred to new storage media with minimal human supervision.

OVERVIEW OF VIDEO RESTORATION ALGORITHM


The schematic diagram of a video restoration algorithm is shown in Figure 1.


Figure 1 Video Restoration Schematic Diagram

BI-DIRECTIONAL MOTION ESTIMATION [4]


The image is first partitioned into blocks of E×E pixels for the computation of the motion vectors. The motion is first estimated by a motion estimation algorithm, and processing is then directed along the calculated motion trajectories. A robust motion estimation algorithm is necessary for the restoration of image sequences, and motion estimation remains a vibrant research field. In this application report, the motion estimation algorithm used is a robust Overlapped Block Matching (OBM) algorithm, as shown in Figure 2. To obtain a reliable and accurate displacement estimate, the block size for block matching has to be chosen carefully. Since the image sequences are bound to be degraded, the estimate will be unreliable and affected by noise if small blocks are used. On the other hand, if large blocks are used, the estimate becomes inaccurate because the displacement vector field inside a large block is not constant. Small blocks are therefore needed to keep the estimated displacement vector field sufficiently local and adaptive. The proposed OBM scheme attempts to circumvent these conflicting requirements when estimating the motion vectors for each frame of the image.

First, the whole frame is divided into blocks of E×E pixels. Each block of E×E pixels requires a search for a forward motion vector (FMV) from a past reference frame (temporal index, t = -1) and a backward motion vector (BMV) from a future frame (temporal index, t = +1). For each block of E×E pixels, one motion vector is selected from the pair of FMV and BMV: the motion vector that yields the smaller sum of absolute errors. The OBM scheme is used to estimate the motion vectors. As shown in Figure 2, the block matching is done with overlapped blocks of D×D pixels in the current frame, where D > E and each E×E pixel block is centred within a D×D pixel block. This D×D pixel block is compared with corresponding blocks within a search area of size (D+2P)×(D+2P) pixels in the previous frame, and the best match is found based on the minimum absolute error (MAE) matching criterion [5]. The motion vector found by comparing the D×D pixel block in the present frame with the (D+2P)×(D+2P) pixel search area in the previous frame is then assigned to the E×E pixel block. The search procedure adopted in the proposed OBM scheme is based on a threshold exhaustive search [5].


Figure 2. Overlapped Block-Matching Motion Estimation Algorithm

BI-DIRECTIONAL 3-DIMENSIONAL AUTO-REGRESSIVE (B3D-AR) MODEL [4]


The 3-Dimensional Auto-Regressive (3D-AR) model [6] has been successfully applied to remove impulsive noise and other types of degradation in image sequences, albeit at a high computational cost in the interpolation process. Kokaram's 3D-AR model [6] was modified into the Bi-directional 3D Auto-Regressive (B3D-AR) model [4], in which the computational cost of the interpolation process is reduced significantly. In this application report, the B3D-AR model as described by equation (1) is used for the detection of corrupted pixels.

Î(i,j,n) = Σ (k=1..N) ak I( (i,j) + (qik, qjk) + d{n,n+qtk}(i,j), n + qtk )    (1)

where:

Î(i,j,n) = predicted pixel intensity at the location (i,j) in the nth frame

ak = Auto-Regressive (AR) model coefficients

N = Total number of AR model coefficients

[qik, qjk, qtk] = offset vector that points to each pixel of the neighbourhood used for the AR model, as shown in Figure 3. The component of the offset vector which determines the temporal direction of the supporting pixel is qtk; its value is -1 for a support pixel in the preceding frame and +1 for a support pixel in the succeeding frame. Therefore I( (i,j) + (qik, qjk) + d{n,n+qtk}(i,j), n + qtk ) is the pixel intensity at the kth support position for the pixel at (i,j,n).

d{n,m}(i,j) = displacement vector between the nth frame and the mth frame. The argument (i,j) denotes that the displacement is a function of position in the image.

For parameter estimation, the task is to choose the parameters so as to minimise some function of the prediction error ε(i,j,n), given in equation (2):

ε(i,j,n) = I(i,j,n) - Î(i,j,n)    (2)

There are two sets of parameters to estimate: the model coefficients and the displacement vectors. The motion vectors are computed first, using a motion estimation algorithm. Subsequently, the Least Mean Square (LMS) approach is used to compute the model coefficients.

Choosing the coefficients to minimise the square of the error in equation (2) leads to the normal equations:

Ra = -r (3)

where R is an N×N matrix of correlation coefficients, a is the N×1 vector of model coefficients, and r is an N×1 vector of correlation coefficients. The solution of equation (3) yields the model coefficients [5].

In our implementation, as shown in Figure 3, each block of 16×16 pixels in the current frame n is modelled with a set of 9 AR coefficients. The predicted intensity of a pixel within the 16×16 block in frame n is calculated from its corresponding motion-compensated 3×3 support region in either the previous or the next frame.




Figure 3 : The support region selected is based on the value of t obtained from the bi-directional motion estimator

DETECTING THE DISTORTIONS IN IMAGE SEQUENCE


The position of a local distortion can be detected by applying a threshold to ε²(i,j,n), the square of the error between the actual and predicted intensity of the pixel at location (i,j,n), which is given by:

ε²(i,j,n) = [ I(i,j,n) - Î(i,j,n) ]²    (4)

where the predicted intensity Î(i,j,n), given in equation (1), is calculated from the AR coefficients estimated in equation (3).

REMOVING THE DISTORTIONS IN IMAGE SEQUENCE


The restoration process can be seen as threefold. First, the pixels detected as "distorted" are weighted according to a Gaussian weighting scheme. Second, a new set of unbiased AR coefficients is estimated using equation (3). Finally, the "distorted" pixels identified using equation (4) are removed by substituting them with the value of Î(i,j,n) calculated with the new set of AR coefficients.

To restore the "distorted" pixels, the re-computed model coefficients are required. As shown in Figure 3, the support region for each predicted pixel has a size of only 3×3 pixels. Thus, the normal equation (equation (3)) must be altered to solve for the AR coefficients using Gaussian-weighted coefficient estimation. Normally, the model coefficients are chosen to minimise the expected value of the squared error at the point concerned. Once detection of dirt has been done, some of the data is known to be degraded. The prediction error at these points is therefore weighted by a function w(i,j,n), so that the degraded portions do not affect the estimation process.

The new weighted error equation may be written as :

εw(i,j,n) = w(i,j,n) Σ (k=0..N) ak I( (i,j) + (qik, qjk) + d{n,n+qtk}(i,j), n + qtk )    (5)

The Gaussian weighting function w(i,j,n) is assigned to each degraded point depending on the magnitude of the error ε(i,j,n) at location (i,j) in the nth frame during the re-computation of the video model. The remaining symbols have their usual meanings from equation (1), with [qi0, qj0, qt0] = [0, 0, 0] and a0 = 1. The Gaussian weighting function can be described as follows:

w(i,j,n) = 1                                      for |ε(i,j,n)| ≤ t1
w(i,j,n) = exp( -(|ε(i,j,n)| - t1)² / (2σ²) )     for t1 < |ε(i,j,n)| < t2
w(i,j,n) = 0                                      for |ε(i,j,n)| ≥ t2    (6)

where t1 and t2 are detection thresholds and σ controls the width of the Gaussian roll-off.

The square of the new Gaussian-weighted error in equation (5) is minimised with respect to the coefficients, yielding a normal equation similar to equation (3).

DETAILS OF THE RESTORATION PROCESS


To restore a block of B×B pixels centred within a block of size M×M pixels in the current frame, the M×M block's motion estimate in the previous or next frame must first be determined. The choice between the previous and next frames is made by the B3D-AR model, as discussed earlier. Then, using these two blocks of pixels, a set of coefficients ak is derived from the normal equations. It is assumed that the information within a block of size M×M is stationary enough to enable the use of one model, i.e. one set of coefficients for all M² pixels within the block. The model is applied to the B×B block, and pixels identified as "noise" are restored.

The support region used for prediction can be represented as x:y. A 9:8 support region means that the support region consists of 9 pixels from the previous frame and 8 from the current frame. We have implemented this model considering information only from the previous or next frame. In other words, we employ the 9:0 or 0:9 model. Each pixel in the current frame is thus modelled by 9 pixels in the previous or next frame. A support region wholly in the previous/next frame is unlikely to be affected by noise around the same relative areas as in the current frame (noise is essentially temporally isolated). The use of the 9:0 support region ensures that the current frame information is not used for detection and cleaning.

The "noisy" pixels are now interpolated with their predicted values after re-calculating the model coefficients. The interpolation equation now used is :

iu = ik Ak (7)

The two vectors iu and ik represent the unknown (noisy) and known pixel intensities, respectively. Ak is the matrix of coefficients ak. The structure of the two matrices ik and Ak has been arranged to set up equation (7) for a simple and computationally efficient solution. Matrix ik is of size u×N and Ak of size N×1, where u is the total number of unknown (noisy) pixels in the B×B block and N is the number of model coefficients ak. The above solution consists of N·u operations, which is O(u) since N is fixed. The cleaning process is therefore dependent on the level of noise present in the B×B block: a less noisy block takes less time to restore than a noisier one, since fewer computations are involved.

PARALLEL IMPLEMENTATION OF THE VIDEO RESTORATION ALGORITHM ON A NETWORK OF TMS320C40 PROCESSORS


The parallelism inherent in image restoration is geometric parallelism. Each frame in the sequence can be partitioned into independent sub-blocks, which are then distributed among the worker (also known as slave) processors by a master (root) processor. Each of these blocks undergoes the same restoration operations. Since the master processor distributes different data packets to the worker processors, each of which performs the same set of operations, the parallel machine employs the SPMD (Single Program Multiple Data) paradigm.

The parallel implementation of the B3D-AR model is carried out on a network of fifteen TMS320C40 digital signal processors. Each TMS320C40 has 8 Mbytes of DRAM, while the root processor has 32 Mbytes. The processors are connected in a tree configuration, chosen because it strikes the right balance between efficiency and algorithmic simplicity. The architecture of the TMS320C40 also limits the maximum number of direct connections per processor to 6 (its six communication ports). The logical configuration is shown in Figure 6.


Figure 6. Logical Arrangement of Tasks

There are two entities involved in our system: tasks and processors. The tasks represent the logical configuration of the system, while the processors represent the physical configuration. The physical configuration is decided by the underlying hardware layout, while the logical configuration is decided by the parallel algorithm used. Figure 6 shows the logical layout. There are three different tasks: master, sub-master and worker. M represents the master task; SM1-SM4 are the 4 sub-master tasks; W1-W14 are the 14 worker tasks. The master task resides on the root processor (first-level processor), which also communicates with the host SUN SPARC10 workstation.

A single processor may have more than one task running concurrently on it. On the second level of the tree configuration, two tasks -- a sub-master task and a worker task -- run on each of the four processors. The dashed arrows in Figure 6 show the logical (non-physical) channels that connect the sub-master and worker tasks within a processor. Thus, 10 of the 14 worker tasks are dedicated, i.e. the processors they reside on perform only the processing job. The 4 remaining worker tasks are non-dedicated: they are placed on processors which perform both distribution and processing jobs. The performance of the non-dedicated worker tasks will obviously be lower, due to the additional distribution workload on their processors. The master task M distributes packets of work to the sub-master tasks, which in turn distribute them to the workers.

The restoration algorithm is implemented in the C programming language and compiled using the 3L Parallel C compiler [7].

LOAD BALANCING


Load balancing of the entire workload is the most important consideration in parallel algorithms. We have employed the Receiver Initiated Load Balancing (RILB) technique [8]. This scheme is characterised by the fact that work is distributed only when an idle task requests it. The request may be explicit, i.e. a message asking for work, or implicit, i.e. the task finishes processing its assigned work packet and passes back the results. We use implicit requesting, since the processed (cleaned) block of data must eventually be sent back to the master for re-combination with the rest of the processed image.

Each work packet consists of a 16×16 pixel block in the current frame together with its search space in the previous and next frames. The sub-masters serve as work distributors. Initially, they send out work packets to all workers under them. The workers receive these packets, perform motion estimation, detect the noise, and restore the corrupted pixels. The workers then pass back the pixel positions of the noise and their new 'clean' intensity values for the final image tiling at the master. This return message also signals that the worker has finished its assigned task. Whenever a sub-master receives such a signal from any worker, it relays the signal upwards to the master and then receives the next work packet from the master.

The size of each work packet must be small enough to ensure that no worker has to wait too long while the master is distributing packets; otherwise, performance degradation sets in.

RESULTS AND DISCUSSIONS


The proposed algorithm is evaluated by applying it to image sequences containing different noise processes:

(1) an uncovered-background region in an image sequence artificially corrupted by single- to multiple-pixel-sized impulses (Salesman sequence);

(2) an occluded region in an image sequence artificially corrupted by single- to multiple-pixel-sized impulses (Salesman sequence);

(3) a sequence that undergoes translational motion and is artificially corrupted by single- to multiple-pixel-sized impulses (Salesman sequence);

(4) an area which undergoes zooming and is artificially corrupted by single- to multiple-pixel-sized impulses (Corridor sequence);

(5) a real degraded image sequence (Frankenstein sequence).

All the artificially added noise is temporally isolated, which is usually the case in real degraded motion pictures [9].

Figure 7 shows an artificially corrupted frame of the Salesman sequence. The blotches and scratch line are temporally isolated. The proposed B3D-AR algorithm was applied to the Salesman sequence, which contains regions undergoing self-occlusion, and to the Corridor sequence, which consists of a scene undergoing zooming. Multiple pixel-sized blotches and artificial scratches were synthetically added to several frames of each sequence. The restored picture quality can be seen in Figure 8, which shows the corresponding restored frame using the B3D-AR model.

Figure 9a shows a magnified portion of the degraded Salesman frame, and Figure 9b the corresponding restored frame using the B3D-AR model. Figure 10a shows a degraded Corridor frame; the Corridor sequence exhibits zooming motion. Figure 10b shows the corresponding restored frame using the B3D-AR model.


Figure 7. A corrupted frame of the 'salesman' sequence with blotches varying in size from 2×2 to 4×4 pixels and a line of width 2.


Figure 8. Restored frame using the bi-directional 3D-AR model



(a) (b)

Figure 9. (a) A magnified portion of the corrupted 'salesman' frame at the region of self-occlusion (b) the corresponding restored frame using the bi-directional 3D-AR model.

(a) (b)

Figure 10. (a) a corrupted frame in the 'corridor' sequence. (b) the corresponding restored frame using bi-directional 3D-AR model.

The restoration quality of the algorithm on a naturally degraded image sequence (obtained by digitising the Frankenstein video) is shown in Figures 11a, 11b, 12a and 12b. The video was first digitised from a PAL-format video tape before the algorithm was applied to the image sequences. The size of each frame in the PAL image sequence is 576×720 pixels. The original Frankenstein sequence is heavily blotched and has been effectively restored by the Gaussian Weighted B3D-AR algorithm.


Figure 11a: Sample A -- a selected frame of a noise-corrupted image sequence


Figure 11b: The corresponding restored frame (Sample A) using bi-directional 3D-AR model.

Figure 12a: Sample B -- a selected frame of a noise-corrupted image sequence

Figure 12b: The corresponding restored frame (sample B) using bi-directional 3D-AR model.

PROCESSING PERFORMANCE OF THE PARALLEL VIDEO RESTORATION SYSTEM


The 2nd-level processors (shown in Figure 6) actually execute two different tasks, sub-master and worker, which run concurrently. This means that if the distribution workload comes to dominate the processing workload on a sub-master processor, drastic performance degradation can take place.

We implemented two algorithms on these four 2nd-level processors to measure the effect of the distribution workload on their performance. The first algorithm (Algorithm A) used a simple mechanism in which no distinction was made between the concurrently executing tasks: the sub-master processor divides time slices equally between the two tasks.

The second algorithm (Algorithm B) followed from a careful analysis of the sub-master and worker processing burdens. The computational load of the worker was found to exceed that of the sub-master by a substantial margin, so it is not justified for the two tasks to receive the same share of processing time. The sub-master task needs to be active only when a worker under it has completed a work packet and requires a fresh one; the worker should therefore receive a larger share of processor time. This was achieved using priorities [7]: the worker tasks were given the highest priority, while the sub-master task was accorded a low priority, becoming active only when required and remaining descheduled otherwise. This is in contrast to Algorithm A, where the distributor task runs constantly without doing any useful processing.

Figure 13 shows the speedup characteristics of algorithms A and B.


Figure 13. Speed-up characteristics of the two algorithms

The improvement in the results for Algorithm B is evident. Up to a network of 4 processors, the two algorithms display the same speed-up, because up to that point the first 4 processors are used as dedicated workers only. When the tree-configured network grows into the 3rd level, the performance of Algorithm A starts to degrade. Algorithm B, on the other hand, does not degrade as much. This clearly demonstrates the precedence that worker tasks should take over distributor tasks. It is also observed that the performance of Algorithm B is well above the P/log P speed-up that is usually accepted as good performance for a parallel algorithm.

CONCLUSION


The video restoration algorithm and its implementation on a network of 15 TMS320C40s are presented in this application report. The algorithm is shown to have better restoration quality when tested on a set of image sequences. The results and analysis show that the B3D-AR model is capable of restoring noise-corrupted video. While the Gaussian Weighting scheme provides good spatial support, the bi-directional scheme prevents the progressive degradation of image sequences due to corruption in regions exhibiting different motion processes, such as occlusion, zooming, rotation and panning. The video restoration system has been tested on different image sequences containing different noise processes, such as variable-size blotches and line scratches, and its effectiveness in restoring these different noise artefacts has been demonstrated. More importantly, when the system is applied to naturally degraded (PAL-size) video, the noise level of the image sequence is significantly reduced while the crispness and sharpness of the original image sequence are retained.

Parallel implementation of the proposed algorithm is realised where close to linear speed-up is achieved on a 15-node TMS320C40 system hosted by a SUN SPARC10 workstation.

REFERENCES

[1] B. Fisher, "Digital Restoration of Snow White : 120,000 Famous Frames are Back", Advanced Imaging, pp. 32-36, September 1993.

[2] G. R. Arce, "Multistage Order Statistic Filters for Image Sequence Processing", IEEE Transactions on Signal Processing, 39(5), pp. 1146-1163, May 1991.

[3] R. D. Morris, "Image Sequence Restoration using Gibbs Distributions", PhD. Thesis, Department of Engineering, University of Cambridge, U.K., May 1995.

[4] W. B. Goh, M. N. Chong, S. Kalra, D. Krishnan, "A Bi-directional 3D AR model approach to motion picture restoration", IEEE Int. Conf. On Acoustics, Speech & Signal Processing, pp. 2277-2280, May 1996.

[5] A. K. Jain, " Fundamentals of Digital Image Processing", Prentice Hall, 1989.

[6] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald, P. J. W. Rayner, "Interpolation of Missing Data in Image Sequences", IEEE Trans. On Image Processing, vol 4 no. 11, pp. 1509-1519, Nov. 1995.

[7] 3L Ltd., "PARALLEL C User Guide for Texas Instruments TMS320C40", 1995

[8] Vipin Kumar, Ananth Y Grama and Nageshwara Rao Vempaty, "Scalable Load Balancing Techniques for Parallel Computers", Journal of Parallel and Distributed Computing, pp60-79,1994.

[9] S. Geman and D.McClure, "A Nonlinear Filter for Film Restoration and other problems in Image Processing", CGVIP, Graphical Models and Image Processing, pp. 281-289, July 1992.

SUMMARY


This application report describes a video restoration algorithm and its implementation on a network of 15 Texas Instruments TMS320C40 Digital Signal Processors. The video restoration algorithm used is a Gaussian Weighted, Bi-directional 3D Auto-Regressive (B3D-AR) model. This restoration algorithm alleviates the presence of noise in old video archives. Common forms of degradation found in such archives are "dirt and sparkle" and scratches. The distortion is caused by the accumulation of dirt, by chemical degradation of the film, or by abrasion of the film material.

This parallel video restoration system is shown to have better restoration quality when tested on a set of image sequences. The results and analysis show that the B3D-AR model is capable of restoring noise-corrupted video. While the Gaussian Weighting scheme provides good spatial support for the model, the bi-directional scheme prevents the progressive degradation of image sequences due to corruption in regions exhibiting different motion processes, such as occlusion, zooming, rotation and panning. The video restoration system has been tested on different image sequences containing different noise processes, such as variable-size blotches and line scratches, and its effectiveness in restoring these different noise artefacts has been demonstrated. More importantly, when the system is applied to naturally degraded (PAL-size) video, the noise level of the image sequence is significantly reduced while the crispness and sharpness of the original image sequence are retained. These results contrast with most existing image restoration algorithms, which blur the edges of moving objects in the vicinity of occluded and uncovered image regions; the video restoration algorithm described here can successfully suppress mixed noise processes and recover lost signals in both the covered and uncovered regions of image sequences.

Parallel implementation of the proposed algorithm is realised, with close to linear speed-up achieved on a 15-node TMS320C40 system hosted by a SUN SPARC10 workstation. This application report describes a fast video restoration system in which distorted old archives can be digitised, restored, and transferred to new storage media with minimal human supervision.



(c) Copyright 1997 Texas Instruments Incorporated. All rights reserved.