/*
 * jidctint.c
 *
 * Copyright (C) 1991-1998, Thomas G. Lane.
 * Modification developed 2002-2009 by Guido Vollbeding.
 * This file is part of the Independent JPEG Group's software.
 * For conditions of distribution and use, see the accompanying README file.
 *
 * This file contains a slow-but-accurate integer implementation of the
 * inverse DCT (Discrete Cosine Transform). In the IJG code, this routine
 * must also perform dequantization of the input coefficients.
 *
 * A 2-D IDCT can be done by 1-D IDCT on each column followed by 1-D IDCT
 * on each row (or vice versa, but it's more convenient to emit a row at
 * a time). Direct algorithms are also available, but they are much more
 * complex and seem not to be any faster when reduced to code.
 *
 * This implementation is based on an algorithm described in
 *   C. Loeffler, A. Ligtenberg and G. Moschytz, "Practical Fast 1-D DCT
 *   Algorithms with 11 Multiplications", Proc. Int'l. Conf. on Acoustics,
 *   Speech, and Signal Processing 1989 (ICASSP '89), pp. 988-991.
 * The primary algorithm described there uses 11 multiplies and 29 adds.
 * We use their alternate method with 12 multiplies and 32 adds.
 * The advantage of this method is that no data path contains more than one
 * multiplication; this allows a very simple and accurate implementation in
 * scaled fixed-point arithmetic, with a minimal number of shifts.
 *
 * We also provide IDCT routines with various output sample block sizes for
 * direct resolution reduction or enlargement and for direct resolving the
 * common 2x1 and 1x2 subsampling cases without additional resampling: NxN
 * (N=1...16), 2NxN, and Nx2N (N=1...8) pixels for one 8x8 input DCT block.
 *
 * For N<8 we simply take the corresponding low-frequency coefficients of
 * the 8x8 input DCT block and apply an NxN point IDCT on the sub-block
 * to yield the downscaled outputs.
 * This can be seen as direct low-pass downsampling from the DCT domain
 * point of view rather than the usual spatial domain point of view,
 * yielding significant computational savings and results at least
 * as good as common bilinear (averaging) spatial downsampling.
 *
 * For N>8 we apply a partial NxN IDCT on the 8 input coefficients as
 * lower frequencies and higher frequencies assumed to be zero.
 * It turns out that the computational effort is similar to the 8x8 IDCT
 * regarding the output size.
 * Furthermore, the scaling and descaling is the same for all IDCT sizes.
 *
 * CAUTION: We rely on the FIX() macro except for the N=1,2,4,8 cases
 * since there would be too many additional constants to pre-calculate.
 */

#define JPEG_INTERNALS
#include "jinclude.h"
#include "jpeglib.h"
#include "jdct.h"               /* Private declarations for DCT subsystem */

#ifdef DCT_ISLOW_SUPPORTED


/*
 * This module is specialized to the case DCTSIZE = 8.
 */

#if DCTSIZE != 8
  Sorry, this code only copes with 8x8 DCT blocks. /* deliberate syntax err */
#endif


/*
 * The poop on this scaling stuff is as follows:
 *
 * Each 1-D IDCT step produces outputs which are a factor of sqrt(N)
 * larger than the true IDCT outputs. The final outputs are therefore
 * a factor of N larger than desired; since N=8 this can be cured by
 * a simple right shift at the end of the algorithm. The advantage of
 * this arrangement is that we save two multiplications per 1-D IDCT,
 * because the y0 and y4 inputs need not be divided by sqrt(N).
 *
 * We have to do addition and subtraction of the integer inputs, which
 * is no problem, and multiplication by fractional constants, which is
 * a problem to do in integer arithmetic. We multiply all the constants
 * by CONST_SCALE and convert them to integer constants (thus retaining
 * CONST_BITS bits of precision in the constants). After doing a
 * multiplication we have to divide the product by CONST_SCALE, with proper
 * rounding, to produce the correct output. This division can be done
 * cheaply as a right shift of CONST_BITS bits. We postpone shifting
 * as long as possible so that partial sums can be added together with
 * full fractional precision.
 *
 * The outputs of the first pass are scaled up by PASS1_BITS bits so that
 * they are represented to better-than-integral precision. These outputs
 * require BITS_IN_JSAMPLE + PASS1_BITS + 3 bits; this fits in a 16-bit word
 * with the recommended scaling. (To scale up 12-bit sample data further, an
 * intermediate INT32 array would be needed.)
 *
 * To avoid overflow of the 32-bit intermediate results in pass 2, we must
 * have BITS_IN_JSAMPLE + CONST_BITS + PASS1_BITS <= 26. Error analysis
 * shows that the values given below are the most effective.
 */

#if BITS_IN_JSAMPLE == 8
#define CONST_BITS  13
#define PASS1_BITS  2
#else
#define CONST_BITS  13
#define PASS1_BITS  1           /* lose a little precision to avoid overflow */
#endif

/* Some C compilers fail to reduce "FIX(constant)" at compile time, thus
 * causing a lot of useless floating-point operations at run time.
 * To get around this we use the following pre-calculated constants.
 * If you change CONST_BITS you may want to add appropriate values.
 * (With a reasonable C compiler, you can just rely on the FIX() macro...)
 */

#if CONST_BITS == 13
#define FIX_0_298631336  ((INT32)  2446)        /* FIX(0.298631336) */
#define FIX_0_390180644  ((INT32)  3196)        /* FIX(0.390180644) */
#define FIX_0_541196100  ((INT32)  4433)        /* FIX(0.541196100) */
#define FIX_0_765366865  ((INT32)  6270)        /* FIX(0.765366865) */
#define FIX_0_899976223  ((INT32)  7373)        /* FIX(0.899976223) */
#define FIX_1_175875602  ((INT32)  9633)        /* FIX(1.175875602) */
#define FIX_1_501321110  ((INT32)  12299)       /* FIX(1.501321110) */
#define FIX_1_847759065  ((INT32)  15137)       /* FIX(1.847759065) */
#define FIX_1_961570560  ((INT32)  16069)       /* FIX(1.961570560) */
#define FIX_2_053119869  ((INT32)  16819)       /* FIX(2.053119869) */
#define FIX_2_562915447  ((INT32)  20995)       /* FIX(2.562915447) */
#define FIX_3_072711026  ((INT32)  25172)       /* FIX(3.072711026) */
#else
#define FIX_0_298631336  FIX(0.298631336)
#define FIX_0_390180644  FIX(0.390180644)
#define FIX_0_541196100  FIX(0.541196100)
#define FIX_0_765366865  FIX(0.765366865)
#define FIX_0_899976223  FIX(0.899976223)
#define FIX_1_175875602  FIX(1.175875602)
#define FIX_1_501321110  FIX(1.501321110)
#define FIX_1_847759065  FIX(1.847759065)
#define FIX_1_961570560  FIX(1.961570560)
#define FIX_2_053119869  FIX(2.053119869)
#define FIX_2_562915447  FIX(2.562915447)
#define FIX_3_072711026  FIX(3.072711026)
#endif


/* Multiply an INT32 variable by an INT32 constant to yield an INT32 result.
 * For 8-bit samples with the recommended scaling, all the variable
 * and constant values involved are no more than 16 bits wide, so a
 * 16x16->32 bit multiply can be used instead of a full 32x32 multiply.
 * For 12-bit samples, a full 32-bit multiplication will be needed.
 */

#if BITS_IN_JSAMPLE == 8
#define MULTIPLY(var,const)  MULTIPLY16C16(var,const)
#else
#define MULTIPLY(var,const)  ((var) * (const))
#endif


/* Dequantize a coefficient by multiplying it by the multiplier-table
 * entry; produce an int result. In this module, both inputs and result
 * are 16 bits or less, so either int or short multiply will work.
 */

#define DEQUANTIZE(coef,quantval)  (((ISLOW_MULT_TYPE) (coef)) * (quantval))


/*
 * Perform dequantization and inverse DCT on one block of coefficients.
 */

GLOBAL(void)
jpeg_idct_islow (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3;
  INT32 tmp10, tmp11, tmp12, tmp13;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[DCTSIZE2];      /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */
  /* Note results are scaled up by sqrt(8) compared to a true IDCT; */
  /* furthermore, we scale the results by 2**PASS1_BITS. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = DCTSIZE; ctr > 0; ctr--) {
    /* Due to quantization, we will usually find that many of the input
     * coefficients are zero, especially the AC terms. We can exploit this
     * by short-circuiting the IDCT calculation for any column in which all
     * the AC terms are zero. In that case each output is equal to the
     * DC coefficient (with scale factor as needed).
     * With typical images and quantization tables, half or more of the
     * column DCT calculations can be simplified this way.
     */

    if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*2] == 0 &&
        inptr[DCTSIZE*3] == 0 && inptr[DCTSIZE*4] == 0 &&
        inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*6] == 0 &&
        inptr[DCTSIZE*7] == 0) {
      /* AC terms all zero */
      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;

      wsptr[DCTSIZE*0] = dcval;
      wsptr[DCTSIZE*1] = dcval;
      wsptr[DCTSIZE*2] = dcval;
      wsptr[DCTSIZE*3] = dcval;
      wsptr[DCTSIZE*4] = dcval;
      wsptr[DCTSIZE*5] = dcval;
      wsptr[DCTSIZE*6] = dcval;
      wsptr[DCTSIZE*7] = dcval;

      inptr++;                  /* advance pointers to next column */
      quantptr++;
      wsptr++;
      continue;
    }

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    z2 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z2 <<= CONST_BITS;
    z3 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    z2 += ONE << (CONST_BITS-PASS1_BITS-1);

    tmp0 = z2 + z3;
    tmp1 = z2 - z3;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
    tmp1 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    tmp3 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    wsptr[DCTSIZE*0] = (int) RIGHT_SHIFT(tmp10 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*7] = (int) RIGHT_SHIFT(tmp10 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*1] = (int) RIGHT_SHIFT(tmp11 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*6] = (int) RIGHT_SHIFT(tmp11 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*2] = (int) RIGHT_SHIFT(tmp12 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*5] = (int) RIGHT_SHIFT(tmp12 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*3] = (int) RIGHT_SHIFT(tmp13 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*4] = (int) RIGHT_SHIFT(tmp13 - tmp0, CONST_BITS-PASS1_BITS);

    inptr++;                    /* advance pointers to next column */
    quantptr++;
    wsptr++;
  }

  /* Pass 2: process rows
     from work array, store into output array. */
  /* Note that we must descale the results by a factor of 8 == 2**3, */
  /* and also undo the PASS1_BITS scaling. */

  wsptr = workspace;
  for (ctr = 0; ctr < DCTSIZE; ctr++) {
    outptr = output_buf[ctr] + output_col;
    /* Rows of zeroes can be exploited in the same way as we did with columns.
     * However, the column calculation has created many nonzero AC terms, so
     * the simplification applies less often (typically 5% to 10% of the time).
     * On machines with very fast multiplication, it's possible that the
     * test takes more time than it's worth. In that case this section
     * may be commented out.
     */

#ifndef NO_ZERO_ROW_TEST
    if (wsptr[1] == 0 && wsptr[2] == 0 && wsptr[3] == 0 && wsptr[4] == 0 &&
        wsptr[5] == 0 && wsptr[6] == 0 && wsptr[7] == 0) {
      /* AC terms all zero */
      JSAMPLE dcval = range_limit[(int) DESCALE((INT32) wsptr[0], PASS1_BITS+3)
                                  & RANGE_MASK];

      outptr[0] = dcval;
      outptr[1] = dcval;
      outptr[2] = dcval;
      outptr[3] = dcval;
      outptr[4] = dcval;
      outptr[5] = dcval;
      outptr[6] = dcval;
      outptr[7] = dcval;

      wsptr += DCTSIZE;         /* advance pointer to next row */
      continue;
    }
#endif

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[6];

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    /* Add fudge factor here for final descale.
     */
    z2 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 = (INT32) wsptr[4];

    tmp0 = (z2 + z3) << CONST_BITS;
    tmp1 = (z2 - z3) << CONST_BITS;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = (INT32) wsptr[7];
    tmp1 = (INT32) wsptr[5];
    tmp2 = (INT32) wsptr[3];
    tmp3 = (INT32) wsptr[1];

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] =
      range_limit[(int) RIGHT_SHIFT(tmp12 + tmp1,
                                    CONST_BITS+PASS1_BITS+3)
                  & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp13 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp13 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += DCTSIZE;           /* advance pointer to next row */
  }
}

#ifdef IDCT_SCALING_SUPPORTED


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 7x7 output block.
 *
 * Optimized algorithm with 12 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/14).
 */

GLOBAL(void)
jpeg_idct_7x7 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp10, tmp11, tmp12, tmp13;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[7*7];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 7; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp13 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp13 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp13 += ONE << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp10 = MULTIPLY(z2 - z3, FIX(0.881747734));     /* c4 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.314692123));     /* c6 */
    tmp11 = tmp10 + tmp12 + tmp13 - MULTIPLY(z2, FIX(1.841218003)); /* c2+c4-c6 */
    tmp0 = z1 + z3;
    z2 -= tmp0;
    tmp0 = MULTIPLY(tmp0, FIX(1.274162392)) + tmp13; /* c2 */
    tmp10 += tmp0 - MULTIPLY(z3, FIX(0.077722536));  /* c2-c4-c6 */
    tmp12 += tmp0 - MULTIPLY(z1, FIX(2.470602249));  /* c2+c4+c6 */
    tmp13 += MULTIPLY(z2, FIX(1.414213562));         /* c0 */

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);

    tmp1 = MULTIPLY(z1 + z2, FIX(0.935414347));      /* (c3+c1-c5)/2 */
    tmp2 = MULTIPLY(z1 - z2, FIX(0.170262339));      /* (c3+c5-c1)/2 */
    tmp0 = tmp1 - tmp2;
    tmp1 += tmp2;
    tmp2 = MULTIPLY(z2 + z3, - FIX(1.378756276));    /* -c1 */
    tmp1 += tmp2;
    z2 = MULTIPLY(z1 + z3, FIX(0.613604268));        /* c5 */
    tmp0 += z2;
    tmp2 += z2 + MULTIPLY(z3, FIX(1.870828693));     /* c3+c1-c5 */

    /* Final output stage */

    wsptr[7*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[7*6] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[7*1] = (int) RIGHT_SHIFT(tmp11 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[7*5] = (int) RIGHT_SHIFT(tmp11 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[7*2] = (int) RIGHT_SHIFT(tmp12 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[7*4] = (int) RIGHT_SHIFT(tmp12 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[7*3] = (int) RIGHT_SHIFT(tmp13, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 7 rows from
     work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 7; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp13 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp13 <<= CONST_BITS;

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[4];
    z3 = (INT32) wsptr[6];

    tmp10 = MULTIPLY(z2 - z3, FIX(0.881747734));     /* c4 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.314692123));     /* c6 */
    tmp11 = tmp10 + tmp12 + tmp13 - MULTIPLY(z2, FIX(1.841218003)); /* c2+c4-c6 */
    tmp0 = z1 + z3;
    z2 -= tmp0;
    tmp0 = MULTIPLY(tmp0, FIX(1.274162392)) + tmp13; /* c2 */
    tmp10 += tmp0 - MULTIPLY(z3, FIX(0.077722536));  /* c2-c4-c6 */
    tmp12 += tmp0 - MULTIPLY(z1, FIX(2.470602249));  /* c2+c4+c6 */
    tmp13 += MULTIPLY(z2, FIX(1.414213562));         /* c0 */

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];

    tmp1 = MULTIPLY(z1 + z2, FIX(0.935414347));      /* (c3+c1-c5)/2 */
    tmp2 = MULTIPLY(z1 - z2, FIX(0.170262339));      /* (c3+c5-c1)/2 */
    tmp0 = tmp1 - tmp2;
    tmp1 += tmp2;
    tmp2 = MULTIPLY(z2 + z3, - FIX(1.378756276));    /* -c1 */
    tmp1 += tmp2;
    z2 = MULTIPLY(z1 + z3, FIX(0.613604268));        /* c5 */
    tmp0 += z2;
    tmp2 += z2 + MULTIPLY(z3, FIX(1.870828693));     /* c3+c1-c5 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int)
                            RIGHT_SHIFT(tmp12 + tmp2,
                                        CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 7;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 6x6 output block.
 *
 * Optimized algorithm with 3 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/12).
 */

GLOBAL(void)
jpeg_idct_6x6 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp10, tmp11, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[6*6];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 6; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    tmp10 = MULTIPLY(tmp2, FIX(0.707106781));   /* c4 */
    tmp1 = tmp0 + tmp10;
    tmp11 = RIGHT_SHIFT(tmp0 - tmp10 - tmp10, CONST_BITS-PASS1_BITS);
    tmp10 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp0 = MULTIPLY(tmp10, FIX(1.224744871));   /* c2 */
    tmp10 = tmp1 + tmp0;
    tmp12 = tmp1 - tmp0;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
    tmp1 = (z1 - z2 - z3) << PASS1_BITS;

    /* Final output stage */

    wsptr[6*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[6*5] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[6*1] = (int) (tmp11 + tmp1);
    wsptr[6*4] = (int) (tmp11 - tmp1);
    wsptr[6*2] = (int) RIGHT_SHIFT(tmp12 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[6*3] = (int) RIGHT_SHIFT(tmp12 - tmp2, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 6 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 6; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;
    tmp2 = (INT32) wsptr[4];
    tmp10 = MULTIPLY(tmp2, FIX(0.707106781));   /* c4 */
    tmp1 = tmp0 + tmp10;
    tmp11 = tmp0 - tmp10 - tmp10;
    tmp10 = (INT32) wsptr[2];
    tmp0 = MULTIPLY(tmp10, FIX(1.224744871));   /* c2 */
    tmp10 = tmp1 + tmp0;
    tmp12 = tmp1 - tmp0;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
    tmp1 = (z1 - z2 - z3) << CONST_BITS;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 6;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 5x5 output block.
 *
 * Optimized algorithm with 5 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/10).
 */

GLOBAL(void)
jpeg_idct_5x5 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp10, tmp11, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[5*5];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 5; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp12 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp12 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp12 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp0 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp1 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z1 = MULTIPLY(tmp0 + tmp1, FIX(0.790569415)); /* (c2+c4)/2 */
    z2 = MULTIPLY(tmp0 - tmp1, FIX(0.353553391)); /* (c2-c4)/2 */
    z3 = tmp12 + z2;
    tmp10 = z3 + z1;
    tmp11 = z3 - z1;
    tmp12 -= z2 << 2;

    /* Odd part */

    z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876));     /* c3 */
    tmp0 = z1 + MULTIPLY(z2, FIX(0.513743148));   /* c1-c3 */
    tmp1 = z1 - MULTIPLY(z3, FIX(2.176250899));   /* c1+c3 */

    /* Final output stage */

    wsptr[5*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[5*4] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[5*1] = (int) RIGHT_SHIFT(tmp11 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[5*3] = (int)
                 RIGHT_SHIFT(tmp11 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[5*2] = (int) RIGHT_SHIFT(tmp12, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 5 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 5; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp12 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp12 <<= CONST_BITS;
    tmp0 = (INT32) wsptr[2];
    tmp1 = (INT32) wsptr[4];
    z1 = MULTIPLY(tmp0 + tmp1, FIX(0.790569415)); /* (c2+c4)/2 */
    z2 = MULTIPLY(tmp0 - tmp1, FIX(0.353553391)); /* (c2-c4)/2 */
    z3 = tmp12 + z2;
    tmp10 = z3 + z1;
    tmp11 = z3 - z1;
    tmp12 -= z2 << 2;

    /* Odd part */

    z2 = (INT32) wsptr[1];
    z3 = (INT32) wsptr[3];

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876));     /* c3 */
    tmp0 = z1 + MULTIPLY(z2, FIX(0.513743148));   /* c1-c3 */
    tmp1 = z1 - MULTIPLY(z3, FIX(2.176250899));   /* c1+c3 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 5;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 4x4 output block.
 *
 * Optimized algorithm with 3 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/16) [refers to 8-point IDCT].
 */

GLOBAL(void)
jpeg_idct_4x4 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp2, tmp10, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[4*4];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 4; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);

    tmp10 = (tmp0 + tmp2) << PASS1_BITS;
    tmp12 = (tmp0 - tmp2) << PASS1_BITS;

    /* Odd part */
    /* Same rotation as in the even part of the 8x8 LL&M IDCT */

    z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);               /* c6 */
    /* Add fudge factor here for final descale. */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp0 = RIGHT_SHIFT(z1 + MULTIPLY(z2, FIX_0_765366865), /* c2-c6 */
                       CONST_BITS-PASS1_BITS);
    tmp2 = RIGHT_SHIFT(z1 - MULTIPLY(z3, FIX_1_847759065), /* c2+c6 */
                       CONST_BITS-PASS1_BITS);

    /* Final output stage */

    wsptr[4*0] = (int) (tmp10 + tmp0);
    wsptr[4*3] = (int) (tmp10 - tmp0);
    wsptr[4*1] = (int) (tmp12 + tmp2);
    wsptr[4*2] = (int) (tmp12 - tmp2);
  }

  /* Pass 2: process 4 rows from work array, store into output array.
*/ 00865 00866 wsptr = workspace; 00867 for (ctr = 0; ctr < 4; ctr++) { 00868 outptr = output_buf[ctr] + output_col; 00869 00870 /* Even part */ 00871 00872 /* Add fudge factor here for final descale. */ 00873 tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2)); 00874 tmp2 = (INT32) wsptr[2]; 00875 00876 tmp10 = (tmp0 + tmp2) << CONST_BITS; 00877 tmp12 = (tmp0 - tmp2) << CONST_BITS; 00878 00879 /* Odd part */ 00880 /* Same rotation as in the even part of the 8x8 LL&M IDCT */ 00881 00882 z2 = (INT32) wsptr[1]; 00883 z3 = (INT32) wsptr[3]; 00884 00885 z1 = MULTIPLY(z2 + z3, FIX_0_541196100); /* c6 */ 00886 tmp0 = z1 + MULTIPLY(z2, FIX_0_765366865); /* c2-c6 */ 00887 tmp2 = z1 - MULTIPLY(z3, FIX_1_847759065); /* c2+c6 */ 00888 00889 /* Final output stage */ 00890 00891 outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0, 00892 CONST_BITS+PASS1_BITS+3) 00893 & RANGE_MASK]; 00894 outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0, 00895 CONST_BITS+PASS1_BITS+3) 00896 & RANGE_MASK]; 00897 outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2, 00898 CONST_BITS+PASS1_BITS+3) 00899 & RANGE_MASK]; 00900 outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2, 00901 CONST_BITS+PASS1_BITS+3) 00902 & RANGE_MASK]; 00903 00904 wsptr += 4; /* advance pointer to next row */ 00905 } 00906 } 00907 00908 00909 /* 00910 * Perform dequantization and inverse DCT on one block of coefficients, 00911 * producing a reduced-size 3x3 output block. 00912 * 00913 * Optimized algorithm with 2 multiplications in the 1-D kernel. 00914 * cK represents sqrt(2) * cos(K*pi/6). 
 */

GLOBAL(void)
jpeg_idct_3x3 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp2, tmp10, tmp12;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[3*3];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 3; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
    tmp10 = tmp0 + tmp12;
    tmp2 = tmp0 - tmp12 - tmp12;

    /* Odd part */

    tmp12 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    tmp0 = MULTIPLY(tmp12, FIX(1.224744871)); /* c1 */

    /* Final output stage */

    wsptr[3*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[3*2] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[3*1] = (int) RIGHT_SHIFT(tmp2, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 3 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 3; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;
    tmp2 = (INT32) wsptr[2];
    tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
    tmp10 = tmp0 + tmp12;
    tmp2 = tmp0 - tmp12 - tmp12;

    /* Odd part */

    tmp12 = (INT32) wsptr[1];
    tmp0 = MULTIPLY(tmp12, FIX(1.224744871)); /* c1 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 3;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 2x2 output block.
 *
 * Multiplication-less algorithm.
 */

GLOBAL(void)
jpeg_idct_2x2 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3, tmp4, tmp5;
  ISLOW_MULT_TYPE * quantptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  SHIFT_TEMPS

  /* Pass 1: process columns from input. */

  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;

  /* Column 0 */
  tmp4 = DEQUANTIZE(coef_block[DCTSIZE*0], quantptr[DCTSIZE*0]);
  tmp5 = DEQUANTIZE(coef_block[DCTSIZE*1], quantptr[DCTSIZE*1]);
  /* Add fudge factor here for final descale. */
  tmp4 += ONE << 2;

  tmp0 = tmp4 + tmp5;
  tmp2 = tmp4 - tmp5;

  /* Column 1 */
  tmp4 = DEQUANTIZE(coef_block[DCTSIZE*0+1], quantptr[DCTSIZE*0+1]);
  tmp5 = DEQUANTIZE(coef_block[DCTSIZE*1+1], quantptr[DCTSIZE*1+1]);

  tmp1 = tmp4 + tmp5;
  tmp3 = tmp4 - tmp5;

  /* Pass 2: process 2 rows, store into output array. */

  /* Row 0 */
  outptr = output_buf[0] + output_col;

  outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp0 + tmp1, 3) & RANGE_MASK];
  outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp0 - tmp1, 3) & RANGE_MASK];

  /* Row 1 */
  outptr = output_buf[1] + output_col;

  outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp2 + tmp3, 3) & RANGE_MASK];
  outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp2 - tmp3, 3) & RANGE_MASK];
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 1x1 output block.
 *
 * We hardly need an inverse DCT routine for this: just take the
 * average pixel value, which is one-eighth of the DC coefficient.
 */

GLOBAL(void)
jpeg_idct_1x1 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  int dcval;
  ISLOW_MULT_TYPE * quantptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  SHIFT_TEMPS

  /* 1x1 is trivial: just take the DC coefficient divided by 8. */
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  dcval = DEQUANTIZE(coef_block[0], quantptr[0]);
  dcval = (int) DESCALE((INT32) dcval, 3);

  output_buf[0][output_col] = range_limit[dcval & RANGE_MASK];
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 9x9 output block.
 *
 * Optimized algorithm with 10 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/18).
 */

GLOBAL(void)
jpeg_idct_9x9 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3, tmp10, tmp11, tmp12, tmp13, tmp14;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*9];   /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp3 = MULTIPLY(z3, FIX(0.707106781));      /* c6 */
    tmp1 = tmp0 + tmp3;
    tmp2 = tmp0 - tmp3 - tmp3;

    tmp0 = MULTIPLY(z1 - z2, FIX(0.707106781)); /* c6 */
    tmp11 = tmp2 + tmp0;
    tmp14 = tmp2 - tmp0 - tmp0;

    tmp0 = MULTIPLY(z1 + z2, FIX(1.328926049)); /* c2 */
    tmp2 = MULTIPLY(z1, FIX(1.083350441));      /* c4 */
    tmp3 = MULTIPLY(z2, FIX(0.245575608));      /* c8 */

    tmp10 = tmp1 + tmp0 - tmp3;
    tmp12 = tmp1 - tmp0 + tmp2;
    tmp13 = tmp1 - tmp2 + tmp3;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    z2 = MULTIPLY(z2, - FIX(1.224744871));           /* -c3 */

    tmp2 = MULTIPLY(z1 + z3, FIX(0.909038955));      /* c5 */
    tmp3 = MULTIPLY(z1 + z4, FIX(0.483689525));      /* c7 */
    tmp0 = tmp2 + tmp3 - z2;
    tmp1 = MULTIPLY(z3 - z4, FIX(1.392728481));      /* c1 */
    tmp2 += z2 - tmp1;
    tmp3 += z2 + tmp1;
    tmp1 = MULTIPLY(z1 - z3 - z4, FIX(1.224744871)); /* c3 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp11 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp11 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp12 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp12 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp13 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp13 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp14, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 9 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 9; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[4];
    z3 = (INT32) wsptr[6];

    tmp3 = MULTIPLY(z3, FIX(0.707106781));      /* c6 */
    tmp1 = tmp0 + tmp3;
    tmp2 = tmp0 - tmp3 - tmp3;

    tmp0 = MULTIPLY(z1 - z2, FIX(0.707106781)); /* c6 */
    tmp11 = tmp2 + tmp0;
    tmp14 = tmp2 - tmp0 - tmp0;

    tmp0 = MULTIPLY(z1 + z2, FIX(1.328926049)); /* c2 */
    tmp2 = MULTIPLY(z1, FIX(1.083350441));      /* c4 */
    tmp3 = MULTIPLY(z2, FIX(0.245575608));      /* c8 */

    tmp10 = tmp1 + tmp0 - tmp3;
    tmp12 = tmp1 - tmp0 + tmp2;
    tmp13 = tmp1 - tmp2 + tmp3;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    z2 = MULTIPLY(z2, - FIX(1.224744871));           /* -c3 */

    tmp2 = MULTIPLY(z1 + z3, FIX(0.909038955));      /* c5 */
    tmp3 = MULTIPLY(z1 + z4, FIX(0.483689525));      /* c7 */
    tmp0 = tmp2 + tmp3 - z2;
    tmp1 = MULTIPLY(z3 - z4, FIX(1.392728481));      /* c1 */
    tmp2 += z2 - tmp1;
    tmp3 += z2 + tmp1;
    tmp1 = MULTIPLY(z1 - z3 - z4, FIX(1.224744871)); /* c3 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp13 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp13 - tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 10x10 output block.
 *
 * Optimized algorithm with 12 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/20).
 */

GLOBAL(void)
jpeg_idct_10x10 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24;
  INT32 z1, z2, z3, z4, z5;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*10];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z3 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    z3 += ONE << (CONST_BITS-PASS1_BITS-1);
    z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z1 = MULTIPLY(z4, FIX(1.144122806));         /* c4 */
    z2 = MULTIPLY(z4, FIX(0.437016024));         /* c8 */
    tmp10 = z3 + z1;
    tmp11 = z3 - z2;

    tmp22 = RIGHT_SHIFT(z3 - ((z1 - z2) << 1),   /* c0 = (c4-c8)*2 */
                        CONST_BITS-PASS1_BITS);

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876));    /* c6 */
    tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
    tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */

    tmp20 = tmp10 + tmp12;
    tmp24 = tmp10 - tmp12;
    tmp21 = tmp11 + tmp13;
    tmp23 = tmp11 - tmp13;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = z2 + z4;
    tmp13 = z2 - z4;

    tmp12 = MULTIPLY(tmp13, FIX(0.309016994));        /* (c3-c7)/2 */
    z5 = z3 << CONST_BITS;

    z2 = MULTIPLY(tmp11, FIX(0.951056516));           /* (c3+c7)/2 */
    z4 = z5 + tmp12;

    tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
    tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */

    z2 = MULTIPLY(tmp11, FIX(0.587785252));           /* (c1-c9)/2 */
    z4 = z5 - tmp12 - (tmp13 << (CONST_BITS - 1));

    tmp12 = (z1 - tmp13 - z3) << PASS1_BITS;

    tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
    tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) (tmp22 + tmp12);
    wsptr[8*7] = (int) (tmp22 - tmp12);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 10 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 10; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 <<= CONST_BITS;
    z4 = (INT32) wsptr[4];
    z1 = MULTIPLY(z4, FIX(1.144122806));         /* c4 */
    z2 = MULTIPLY(z4, FIX(0.437016024));         /* c8 */
    tmp10 = z3 + z1;
    tmp11 = z3 - z2;

    tmp22 = z3 - ((z1 - z2) << 1);               /* c0 = (c4-c8)*2 */

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[6];

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876));    /* c6 */
    tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
    tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */

    tmp20 = tmp10 + tmp12;
    tmp24 = tmp10 - tmp12;
    tmp21 = tmp11 + tmp13;
    tmp23 = tmp11 - tmp13;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z3 <<= CONST_BITS;
    z4 = (INT32) wsptr[7];

    tmp11 = z2 + z4;
    tmp13 = z2 - z4;

    tmp12 = MULTIPLY(tmp13, FIX(0.309016994));        /* (c3-c7)/2 */

    z2 = MULTIPLY(tmp11, FIX(0.951056516));           /* (c3+c7)/2 */
    z4 = z3 + tmp12;

    tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
    tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */

    z2 = MULTIPLY(tmp11, FIX(0.587785252));           /* (c1-c9)/2 */
    z4 = z3 - tmp12 - (tmp13 << (CONST_BITS - 1));

    tmp12 = ((z1 - tmp13) << CONST_BITS) - z3;

    tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
    tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 11x11 output block.
 *
 * Optimized algorithm with 24 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/22).
 */

GLOBAL(void)
jpeg_idct_11x11 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*11];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp10 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp10 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp10 += ONE << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp20 = MULTIPLY(z2 - z3, FIX(2.546640132));     /* c2+c4 */
    tmp23 = MULTIPLY(z2 - z1, FIX(0.430815045));     /* c2-c6 */
    z4 = z1 + z3;
    tmp24 = MULTIPLY(z4, - FIX(1.155664402));        /* -(c2-c10) */
    z4 -= z2;
    tmp25 = tmp10 + MULTIPLY(z4, FIX(1.356927976));  /* c2 */
    tmp21 = tmp20 + tmp23 + tmp25 -
            MULTIPLY(z2, FIX(1.821790775));          /* c2+c4+c10-c6 */
    tmp20 += tmp25 + MULTIPLY(z3, FIX(2.115825087)); /* c4+c6 */
    tmp23 += tmp25 - MULTIPLY(z1, FIX(1.513598477)); /* c6+c8 */
    tmp24 += tmp25;
    tmp22 = tmp24 - MULTIPLY(z3, FIX(0.788749120));  /* c8+c10 */
    tmp24 += MULTIPLY(z2, FIX(1.944413522)) -        /* c2+c8 */
             MULTIPLY(z1, FIX(1.390975730));         /* c4+c10 */
    tmp25 = tmp10 - MULTIPLY(z4, FIX(1.414213562));  /* c0 */

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = z1 + z2;
    tmp14 = MULTIPLY(tmp11 + z3 + z4, FIX(0.398430003)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.887983902));           /* c3-c9 */
    tmp12 = MULTIPLY(z1 + z3, FIX(0.670361295));         /* c5-c9 */
    tmp13 = tmp14 + MULTIPLY(z1 + z4, FIX(0.366151574)); /* c7-c9 */
    tmp10 = tmp11 + tmp12 + tmp13 -
            MULTIPLY(z1, FIX(0.923107866));              /* c7+c5+c3-c1-2*c9 */
    z1    = tmp14 - MULTIPLY(z2 + z3, FIX(1.163011579)); /* c7+c9 */
    tmp11 += z1 + MULTIPLY(z2, FIX(2.073276588));        /* c1+c7+3*c9-c3 */
    tmp12 += z1 - MULTIPLY(z3, FIX(1.192193623));        /* c3+c5-c7-c9 */
    z1    = MULTIPLY(z2 + z4, - FIX(1.798248910));       /* -(c1+c9) */
    tmp11 += z1;
    tmp13 += z1 + MULTIPLY(z4, FIX(2.102458632));        /* c1+c5+c9-c7 */
    tmp14 += MULTIPLY(z2, - FIX(1.467221301)) +          /* -(c5+c9) */
             MULTIPLY(z3, FIX(1.001388905)) -            /* c1-c9 */
             MULTIPLY(z4, FIX(1.684843907));             /* c3+c9 */

    /* Final output stage */

    wsptr[8*0]  = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*10] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1]  = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*9]  = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2]  = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*8]  = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3]  = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*7]  = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*4]  = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*6]  = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*5]  = (int) RIGHT_SHIFT(tmp25, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 11 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 11; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp10 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp10 <<= CONST_BITS;

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[4];
    z3 = (INT32) wsptr[6];

    tmp20 = MULTIPLY(z2 - z3, FIX(2.546640132));     /* c2+c4 */
    tmp23 = MULTIPLY(z2 - z1, FIX(0.430815045));     /* c2-c6 */
    z4 = z1 + z3;
    tmp24 = MULTIPLY(z4, - FIX(1.155664402));        /* -(c2-c10) */
    z4 -= z2;
    tmp25 = tmp10 + MULTIPLY(z4, FIX(1.356927976));  /* c2 */
    tmp21 = tmp20 + tmp23 + tmp25 -
            MULTIPLY(z2, FIX(1.821790775));          /* c2+c4+c10-c6 */
    tmp20 += tmp25 + MULTIPLY(z3, FIX(2.115825087)); /* c4+c6 */
    tmp23 += tmp25 - MULTIPLY(z1, FIX(1.513598477)); /* c6+c8 */
    tmp24 += tmp25;
    tmp22 = tmp24 - MULTIPLY(z3, FIX(0.788749120));  /* c8+c10 */
    tmp24 += MULTIPLY(z2, FIX(1.944413522)) -        /* c2+c8 */
             MULTIPLY(z1, FIX(1.390975730));         /* c4+c10 */
    tmp25 = tmp10 - MULTIPLY(z4, FIX(1.414213562));  /* c0 */

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = z1 + z2;
    tmp14 = MULTIPLY(tmp11 + z3 + z4, FIX(0.398430003)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.887983902));           /* c3-c9 */
    tmp12 = MULTIPLY(z1 + z3, FIX(0.670361295));         /* c5-c9 */
    tmp13 = tmp14 + MULTIPLY(z1 + z4, FIX(0.366151574)); /* c7-c9 */
    tmp10 = tmp11 + tmp12 + tmp13 -
            MULTIPLY(z1, FIX(0.923107866));              /* c7+c5+c3-c1-2*c9 */
    z1    = tmp14 - MULTIPLY(z2 + z3, FIX(1.163011579)); /* c7+c9 */
    tmp11 += z1 + MULTIPLY(z2, FIX(2.073276588));        /* c1+c7+3*c9-c3 */
    tmp12 += z1 - MULTIPLY(z3, FIX(1.192193623));        /* c3+c5-c7-c9 */
    z1    = MULTIPLY(z2 + z4, - FIX(1.798248910));       /* -(c1+c9) */
    tmp11 += z1;
    tmp13 += z1 + MULTIPLY(z4, FIX(2.102458632));        /* c1+c5+c9-c7 */
    tmp14 += MULTIPLY(z2, - FIX(1.467221301)) +          /* -(c5+c9) */
             MULTIPLY(z3, FIX(1.001388905)) -            /* c1-c9 */
             MULTIPLY(z4, FIX(1.684843907));             /* c3+c9 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;         /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 12x12 output block.
 *
 * Optimized algorithm with 15 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/24).
01646 */ 01647 01648 GLOBAL(void) 01649 jpeg_idct_12x12 (j_decompress_ptr cinfo, jpeg_component_info * compptr, 01650 JCOEFPTR coef_block, 01651 JSAMPARRAY output_buf, JDIMENSION output_col) 01652 { 01653 INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15; 01654 INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25; 01655 INT32 z1, z2, z3, z4; 01656 JCOEFPTR inptr; 01657 ISLOW_MULT_TYPE * quantptr; 01658 int * wsptr; 01659 JSAMPROW outptr; 01660 JSAMPLE *range_limit = IDCT_range_limit(cinfo); 01661 int ctr; 01662 int workspace[8*12]; /* buffers data between passes */ 01663 SHIFT_TEMPS 01664 01665 /* Pass 1: process columns from input, store into work array. */ 01666 01667 inptr = coef_block; 01668 quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table; 01669 wsptr = workspace; 01670 for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) { 01671 /* Even part */ 01672 01673 z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]); 01674 z3 <<= CONST_BITS; 01675 /* Add fudge factor here for final descale. 
*/ 01676 z3 += ONE << (CONST_BITS-PASS1_BITS-1); 01677 01678 z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]); 01679 z4 = MULTIPLY(z4, FIX(1.224744871)); /* c4 */ 01680 01681 tmp10 = z3 + z4; 01682 tmp11 = z3 - z4; 01683 01684 z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]); 01685 z4 = MULTIPLY(z1, FIX(1.366025404)); /* c2 */ 01686 z1 <<= CONST_BITS; 01687 z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]); 01688 z2 <<= CONST_BITS; 01689 01690 tmp12 = z1 - z2; 01691 01692 tmp21 = z3 + tmp12; 01693 tmp24 = z3 - tmp12; 01694 01695 tmp12 = z4 + z2; 01696 01697 tmp20 = tmp10 + tmp12; 01698 tmp25 = tmp10 - tmp12; 01699 01700 tmp12 = z4 - z1 - z2; 01701 01702 tmp22 = tmp11 + tmp12; 01703 tmp23 = tmp11 - tmp12; 01704 01705 /* Odd part */ 01706 01707 z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]); 01708 z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]); 01709 z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]); 01710 z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]); 01711 01712 tmp11 = MULTIPLY(z2, FIX(1.306562965)); /* c3 */ 01713 tmp14 = MULTIPLY(z2, - FIX_0_541196100); /* -c9 */ 01714 01715 tmp10 = z1 + z3; 01716 tmp15 = MULTIPLY(tmp10 + z4, FIX(0.860918669)); /* c7 */ 01717 tmp12 = tmp15 + MULTIPLY(tmp10, FIX(0.261052384)); /* c5-c7 */ 01718 tmp10 = tmp12 + tmp11 + MULTIPLY(z1, FIX(0.280143716)); /* c1-c5 */ 01719 tmp13 = MULTIPLY(z3 + z4, - FIX(1.045510580)); /* -(c7+c11) */ 01720 tmp12 += tmp13 + tmp14 - MULTIPLY(z3, FIX(1.478575242)); /* c1+c5-c7-c11 */ 01721 tmp13 += tmp15 - tmp11 + MULTIPLY(z4, FIX(1.586706681)); /* c1+c11 */ 01722 tmp15 += tmp14 - MULTIPLY(z1, FIX(0.676326758)) - /* c7-c11 */ 01723 MULTIPLY(z4, FIX(1.982889723)); /* c5+c7 */ 01724 01725 z1 -= z4; 01726 z2 -= z3; 01727 z3 = MULTIPLY(z1 + z2, FIX_0_541196100); /* c9 */ 01728 tmp11 = z3 + MULTIPLY(z1, FIX_0_765366865); /* c3-c9 */ 01729 tmp14 = z3 - MULTIPLY(z2, FIX_1_847759065); /* c3+c9 */ 01730 01731 /* Final output stage */ 01732 01733 wsptr[8*0] = (int) 
RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS); 01734 wsptr[8*11] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS); 01735 wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS); 01736 wsptr[8*10] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS); 01737 wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS); 01738 wsptr[8*9] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS); 01739 wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS); 01740 wsptr[8*8] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS); 01741 wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS); 01742 wsptr[8*7] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS); 01743 wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS); 01744 wsptr[8*6] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS); 01745 } 01746 01747 /* Pass 2: process 12 rows from work array, store into output array. */ 01748 01749 wsptr = workspace; 01750 for (ctr = 0; ctr < 12; ctr++) { 01751 outptr = output_buf[ctr] + output_col; 01752 01753 /* Even part */ 01754 01755 /* Add fudge factor here for final descale. 
     */
    z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 <<= CONST_BITS;

    z4 = (INT32) wsptr[4];
    z4 = MULTIPLY(z4, FIX(1.224744871));  /* c4 */

    tmp10 = z3 + z4;
    tmp11 = z3 - z4;

    z1 = (INT32) wsptr[2];
    z4 = MULTIPLY(z1, FIX(1.366025404));  /* c2 */
    z1 <<= CONST_BITS;
    z2 = (INT32) wsptr[6];
    z2 <<= CONST_BITS;

    tmp12 = z1 - z2;

    tmp21 = z3 + tmp12;
    tmp24 = z3 - tmp12;

    tmp12 = z4 + z2;

    tmp20 = tmp10 + tmp12;
    tmp25 = tmp10 - tmp12;

    tmp12 = z4 - z1 - z2;

    tmp22 = tmp11 + tmp12;
    tmp23 = tmp11 - tmp12;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = MULTIPLY(z2, FIX(1.306562965));  /* c3 */
    tmp14 = MULTIPLY(z2, - FIX_0_541196100);  /* -c9 */

    tmp10 = z1 + z3;
    tmp15 = MULTIPLY(tmp10 + z4, FIX(0.860918669));  /* c7 */
    tmp12 = tmp15 + MULTIPLY(tmp10, FIX(0.261052384));  /* c5-c7 */
    tmp10 = tmp12 + tmp11 + MULTIPLY(z1, FIX(0.280143716));  /* c1-c5 */
    tmp13 = MULTIPLY(z3 + z4, - FIX(1.045510580));  /* -(c7+c11) */
    tmp12 += tmp13 + tmp14 - MULTIPLY(z3, FIX(1.478575242));  /* c1+c5-c7-c11 */
    tmp13 += tmp15 - tmp11 + MULTIPLY(z4, FIX(1.586706681));  /* c1+c11 */
    tmp15 += tmp14 - MULTIPLY(z1, FIX(0.676326758)) -  /* c7-c11 */
             MULTIPLY(z4, FIX(1.982889723));  /* c5+c7 */

    z1 -= z4;
    z2 -= z3;
    z3 = MULTIPLY(z1 + z2, FIX_0_541196100);  /* c9 */
    tmp11 = z3 + MULTIPLY(z1, FIX_0_765366865);  /* c3-c9 */
    tmp14 = z3 - MULTIPLY(z2, FIX_1_847759065);  /* c3+c9 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             &
                               RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;  /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 13x13 output block.
 *
 * Optimized algorithm with 29 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/26).
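 *
 * Worked check of this notation (an illustrative note added in editing,
 * not part of the original source; decimals rounded to six places):
 *   c3 = sqrt(2) * cos(3*pi/26)  ~= 1.322313, matching FIX(1.322312651);
 *   c1 = sqrt(2) * cos(pi/26)    ~= 1.403902;
 *   c7+c5+c3-c1 ~= 0.937797 + 1.163875 + 1.322313 - 1.403902 = 2.020083,
 * which agrees with FIX(2.020082300) up to the rounding of the inputs.
 * Sums like this are folded into one constant per multiplication.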
 */

GLOBAL(void)
jpeg_idct_13x13 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*13];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z1 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp10 = z3 + z4;
    tmp11 = z3 - z4;

    tmp12 = MULTIPLY(tmp10, FIX(1.155388986));  /* (c4+c6)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.096834934)) + z1;  /* (c4-c6)/2 */

    tmp20 = MULTIPLY(z2, FIX(1.373119086)) + tmp12 + tmp13;  /* c2 */
    tmp22 = MULTIPLY(z2, FIX(0.501487041)) - tmp12 + tmp13;  /* c10 */

    tmp12 = MULTIPLY(tmp10, FIX(0.316450131));  /* (c8-c12)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.486914739)) + z1;  /* (c8+c12)/2 */

    tmp21 = MULTIPLY(z2, FIX(1.058554052)) - tmp12 + tmp13;  /* c6 */
    tmp25 = MULTIPLY(z2, - FIX(1.252223920)) + tmp12 + tmp13;  /* c4 */

    tmp12 = MULTIPLY(tmp10, FIX(0.435816023));  /* (c2-c10)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.937303064)) - z1;  /* (c2+c10)/2 */

    tmp23 = MULTIPLY(z2, - FIX(0.170464608)) - tmp12 - tmp13;  /* c12 */
    tmp24 = MULTIPLY(z2, - FIX(0.803364869)) + tmp12 - tmp13;  /* c8 */

    tmp26 = MULTIPLY(tmp11 - z2, FIX(1.414213562)) + z1;  /* c0 */

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = MULTIPLY(z1 + z2, FIX(1.322312651));  /* c3 */
    tmp12 = MULTIPLY(z1 + z3, FIX(1.163874945));  /* c5 */
    tmp15 = z1 + z4;
    tmp13 = MULTIPLY(tmp15, FIX(0.937797057));  /* c7 */
    tmp10 = tmp11 + tmp12 + tmp13 -
            MULTIPLY(z1, FIX(2.020082300));  /* c7+c5+c3-c1 */
    tmp14 = MULTIPLY(z2 + z3, - FIX(0.338443458));  /* -c11 */
    tmp11 += tmp14 + MULTIPLY(z2, FIX(0.837223564));  /* c5+c9+c11-c3 */
    tmp12 += tmp14 -
             MULTIPLY(z3, FIX(1.572116027));  /* c1+c5-c9-c11 */
    tmp14 = MULTIPLY(z2 + z4, - FIX(1.163874945));  /* -c5 */
    tmp11 += tmp14;
    tmp13 += tmp14 + MULTIPLY(z4, FIX(2.205608352));  /* c3+c5+c9-c7 */
    tmp14 = MULTIPLY(z3 + z4, - FIX(0.657217813));  /* -c9 */
    tmp12 += tmp14;
    tmp13 += tmp14;
    tmp15 = MULTIPLY(tmp15, FIX(0.338443458));  /* c11 */
    tmp14 = tmp15 + MULTIPLY(z1, FIX(0.318774355)) -  /* c9-c11 */
            MULTIPLY(z2, FIX(0.466105296));  /* c1-c7 */
    z1 = MULTIPLY(z3 - z2, FIX(0.937797057));  /* c7 */
    tmp14 += z1;
    tmp15 += z1 + MULTIPLY(z3, FIX(0.384515595)) -  /* c3-c7 */
             MULTIPLY(z4, FIX(1.742345811));  /* c1+c11 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*12] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*11] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*10] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp26, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 13 rows from work array, store into output array.
   */

  wsptr = workspace;
  for (ctr = 0; ctr < 13; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z1 <<= CONST_BITS;

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[4];
    z4 = (INT32) wsptr[6];

    tmp10 = z3 + z4;
    tmp11 = z3 - z4;

    tmp12 = MULTIPLY(tmp10, FIX(1.155388986));  /* (c4+c6)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.096834934)) + z1;  /* (c4-c6)/2 */

    tmp20 = MULTIPLY(z2, FIX(1.373119086)) + tmp12 + tmp13;  /* c2 */
    tmp22 = MULTIPLY(z2, FIX(0.501487041)) - tmp12 + tmp13;  /* c10 */

    tmp12 = MULTIPLY(tmp10, FIX(0.316450131));  /* (c8-c12)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.486914739)) + z1;  /* (c8+c12)/2 */

    tmp21 = MULTIPLY(z2, FIX(1.058554052)) - tmp12 + tmp13;  /* c6 */
    tmp25 = MULTIPLY(z2, - FIX(1.252223920)) + tmp12 + tmp13;  /* c4 */

    tmp12 = MULTIPLY(tmp10, FIX(0.435816023));  /* (c2-c10)/2 */
    tmp13 = MULTIPLY(tmp11, FIX(0.937303064)) - z1;  /* (c2+c10)/2 */

    tmp23 = MULTIPLY(z2, - FIX(0.170464608)) - tmp12 - tmp13;  /* c12 */
    tmp24 = MULTIPLY(z2, - FIX(0.803364869)) + tmp12 - tmp13;  /* c8 */

    tmp26 = MULTIPLY(tmp11 - z2, FIX(1.414213562)) + z1;  /* c0 */

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = MULTIPLY(z1 + z2, FIX(1.322312651));  /* c3 */
    tmp12 = MULTIPLY(z1 + z3, FIX(1.163874945));  /* c5 */
    tmp15 = z1 + z4;
    tmp13 = MULTIPLY(tmp15, FIX(0.937797057));  /* c7 */
    tmp10 = tmp11 + tmp12 + tmp13 -
            MULTIPLY(z1, FIX(2.020082300));  /* c7+c5+c3-c1 */
    tmp14 = MULTIPLY(z2 + z3, - FIX(0.338443458));  /* -c11 */
    tmp11 += tmp14 + MULTIPLY(z2, FIX(0.837223564));  /* c5+c9+c11-c3 */
    tmp12 += tmp14 -
             MULTIPLY(z3, FIX(1.572116027));  /* c1+c5-c9-c11 */
    tmp14 = MULTIPLY(z2 + z4, - FIX(1.163874945));  /* -c5 */
    tmp11 += tmp14;
    tmp13 += tmp14 + MULTIPLY(z4, FIX(2.205608352));  /* c3+c5+c9-c7 */
    tmp14 = MULTIPLY(z3 + z4, - FIX(0.657217813));  /* -c9 */
    tmp12 += tmp14;
    tmp13 += tmp14;
    tmp15 = MULTIPLY(tmp15, FIX(0.338443458));  /* c11 */
    tmp14 = tmp15 + MULTIPLY(z1, FIX(0.318774355)) -  /* c9-c11 */
            MULTIPLY(z2, FIX(0.466105296));  /* c1-c7 */
    z1 = MULTIPLY(z3 - z2, FIX(0.937797057));  /* c7 */
    tmp14 += z1;
    tmp15 += z1 + MULTIPLY(z3, FIX(0.384515595)) -  /* c3-c7 */
             MULTIPLY(z4, FIX(1.742345811));  /* c1+c11 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;  /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 14x14 output block.
 *
 * Optimized algorithm with 20 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/28).
 */

GLOBAL(void)
jpeg_idct_14x14 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*14];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z1 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);
    z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z2 = MULTIPLY(z4, FIX(1.274162392));  /* c4 */
    z3 = MULTIPLY(z4, FIX(0.314692123));  /* c12 */
    z4 = MULTIPLY(z4, FIX(0.881747734));  /* c8 */

    tmp10 = z1 + z2;
    tmp11 = z1 + z3;
    tmp12 = z1 - z4;

    tmp23 = RIGHT_SHIFT(z1 - ((z2 + z3 - z4) << 1),  /* c0 = (c4+c12-c8)*2 */
                        CONST_BITS-PASS1_BITS);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z3 = MULTIPLY(z1 + z2, FIX(1.105676686));  /* c6 */

    tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590));  /* c2-c6 */
    tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954));  /* c6+c10 */
    tmp15 = MULTIPLY(z1, FIX(0.613604268)) -  /* c10 */
            MULTIPLY(z2, FIX(1.378756276));  /* c2 */

    tmp20 = tmp10 + tmp13;
    tmp26 = tmp10 - tmp13;
    tmp21 = tmp11 + tmp14;
    tmp25 = tmp11 - tmp14;
    tmp22 = tmp12 + tmp15;
    tmp24 = tmp12 - tmp15;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
    tmp13 = z4 << CONST_BITS;

    tmp14 = z1 + z3;
    tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607));  /* c3 */
    tmp12 = MULTIPLY(tmp14, FIX(1.197448846));  /* c5 */
    tmp10 = tmp11 + tmp12 + tmp13 - MULTIPLY(z1, FIX(1.126980169));  /* c3+c5-c1 */
    tmp14 = MULTIPLY(tmp14, FIX(0.752406978));  /* c9 */
    tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426));  /* c9+c11-c13 */
    z1 -= z2;
    tmp15 = MULTIPLY(z1, FIX(0.467085129)) - tmp13;  /* c11 */
    tmp16 += tmp15;
    z1 += z4;
    z4 = MULTIPLY(z2 + z3, - FIX(0.158341681)) - tmp13;  /* -c13 */
    tmp11 += z4 - MULTIPLY(z2, FIX(0.424103948));  /* c3-c9-c13 */
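    /* Worked check of the folded constant in the statement just above (an
     * illustrative note added in editing, not part of the original source).
     * With cK = sqrt(2) * cos(K*pi/28) and the c3, c9 and c13 values that
     * appear in this kernel:
     *   c3 - c9 - c13 = 1.334852607 - 0.752406978 - 0.158341681
     *                 = 0.424103948
     * so one multiplication by FIX(0.424103948) stands in for three.
     */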
    tmp12 += z4 - MULTIPLY(z3, FIX(2.373959773));  /* c3+c5-c13 */
    z4 = MULTIPLY(z3 - z2, FIX(1.405321284));  /* c1 */
    tmp14 += z4 + tmp13 - MULTIPLY(z3, FIX(1.6906431334));  /* c1+c9-c11 */
    tmp15 += z4 + MULTIPLY(z2, FIX(0.674957567));  /* c1+c11-c5 */

    tmp13 = (z1 - z3) << PASS1_BITS;

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*13] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*12] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*11] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) (tmp23 + tmp13);
    wsptr[8*10] = (int) (tmp23 - tmp13);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp26 + tmp16, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp26 - tmp16, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 14 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 14; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z1 <<= CONST_BITS;
    z4 = (INT32) wsptr[4];
    z2 = MULTIPLY(z4, FIX(1.274162392));  /* c4 */
    z3 = MULTIPLY(z4, FIX(0.314692123));  /* c12 */
    z4 = MULTIPLY(z4, FIX(0.881747734));  /* c8 */

    tmp10 = z1 + z2;
    tmp11 = z1 + z3;
    tmp12 = z1 - z4;

    tmp23 = z1 - ((z2 + z3 - z4) << 1);  /* c0 = (c4+c12-c8)*2 */

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[6];

    z3 = MULTIPLY(z1 + z2, FIX(1.105676686));  /* c6 */

    tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590));  /* c2-c6 */
    tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954));  /* c6+c10 */
    tmp15 = MULTIPLY(z1, FIX(0.613604268)) -  /* c10 */
            MULTIPLY(z2, FIX(1.378756276));  /* c2 */

    tmp20 = tmp10 + tmp13;
    tmp26 = tmp10 - tmp13;
    tmp21 = tmp11 + tmp14;
    tmp25 = tmp11 - tmp14;
    tmp22 = tmp12 + tmp15;
    tmp24 = tmp12 - tmp15;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];
    z4 <<= CONST_BITS;

    tmp14 = z1 + z3;
    tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607));  /* c3 */
    tmp12 = MULTIPLY(tmp14, FIX(1.197448846));  /* c5 */
    tmp10 = tmp11 + tmp12 + z4 - MULTIPLY(z1, FIX(1.126980169));  /* c3+c5-c1 */
    tmp14 = MULTIPLY(tmp14, FIX(0.752406978));  /* c9 */
    tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426));  /* c9+c11-c13 */
    z1 -= z2;
    tmp15 = MULTIPLY(z1, FIX(0.467085129)) - z4;  /* c11 */
    tmp16 += tmp15;
    tmp13 = MULTIPLY(z2 + z3, - FIX(0.158341681)) - z4;  /* -c13 */
    tmp11 += tmp13 - MULTIPLY(z2, FIX(0.424103948));  /* c3-c9-c13 */
    tmp12 += tmp13 - MULTIPLY(z3, FIX(2.373959773));  /* c3+c5-c13 */
    tmp13 = MULTIPLY(z3 - z2, FIX(1.405321284));  /* c1 */
    tmp14 += tmp13 + z4 - MULTIPLY(z3, FIX(1.6906431334));  /* c1+c9-c11 */
    tmp15 += tmp13 + MULTIPLY(z2, FIX(0.674957567));  /*
       c1+c11-c5 */

    tmp13 = ((z1 - z3) << CONST_BITS) + z4;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26 + tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp26 - tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8;  /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 15x15 output block.
 *
 * Optimized algorithm with 22 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/30).
 */

GLOBAL(void)
jpeg_idct_15x15 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26, tmp27;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*15];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z1 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp10 = MULTIPLY(z4, FIX(0.437016024));  /* c12 */
    tmp11 = MULTIPLY(z4, FIX(1.144122806));  /* c6 */

    tmp12 = z1 - tmp10;
    tmp13 = z1 + tmp11;
    z1 -= (tmp11 - tmp10) << 1;  /* c0 = (c6-c12)*2 */

    z4 = z2 - z3;
    z3 += z2;
    tmp10 = MULTIPLY(z3, FIX(1.337628990));  /* (c2+c4)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.045680613));  /* (c2-c4)/2 */
    z2 = MULTIPLY(z2, FIX(1.439773946));  /* c4+c14 */

    tmp20 = tmp13 + tmp10 + tmp11;
    tmp23 = tmp12 - tmp10 + tmp11 + z2;

    tmp10 = MULTIPLY(z3, FIX(0.547059574));  /* (c8+c14)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.399234004));  /* (c8-c14)/2 */

    tmp25 = tmp13 - tmp10 - tmp11;
    tmp26 = tmp12 + tmp10 - tmp11 - z2;

    tmp10 = MULTIPLY(z3, FIX(0.790569415));  /* (c6+c12)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.353553391));  /* (c6-c12)/2 */

    tmp21 = tmp12 + tmp10 + tmp11;
    tmp24 = tmp13 - tmp10 + tmp11;
    tmp11 += tmp11;
    tmp22 = z1 + tmp11;  /* c10 = c6-c12 */
    tmp27 = z1 - tmp11 - tmp11;  /* c0 = (c6-c12)*2 */

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z3 = MULTIPLY(z4, FIX(1.224744871));  /* c5 */
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp13 = z2 - z4;
    tmp15 = MULTIPLY(z1 + tmp13, FIX(0.831253876));  /* c9 */
    tmp11 = tmp15 + MULTIPLY(z1, FIX(0.513743148));  /* c3-c9 */
    tmp14 = tmp15 - MULTIPLY(tmp13, FIX(2.176250899));  /* c3+c9 */

    tmp13 = MULTIPLY(z2, - FIX(0.831253876));  /* -c9 */
    tmp15 = MULTIPLY(z2, -
                     FIX(1.344997024));  /* -c3 */
    z2 = z1 - z4;
    tmp12 = z3 + MULTIPLY(z2, FIX(1.406466353));  /* c1 */

    tmp10 = tmp12 + MULTIPLY(z4, FIX(2.457431844)) - tmp15;  /* c1+c7 */
    tmp16 = tmp12 - MULTIPLY(z1, FIX(1.112434820)) + tmp13;  /* c1-c13 */
    tmp12 = MULTIPLY(z2, FIX(1.224744871)) - z3;  /* c5 */
    z2 = MULTIPLY(z1 + z4, FIX(0.575212477));  /* c11 */
    tmp13 += z2 + MULTIPLY(z1, FIX(0.475753014)) - z3;  /* c7-c11 */
    tmp15 += z2 - MULTIPLY(z4, FIX(0.869244010)) + z3;  /* c11+c13 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*14] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*13] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*12] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*11] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*10] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp26 + tmp16, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp26 - tmp16, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp27, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 15 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 15; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z1 <<= CONST_BITS;

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[4];
    z4 = (INT32) wsptr[6];

    tmp10 = MULTIPLY(z4, FIX(0.437016024));  /* c12 */
    tmp11 = MULTIPLY(z4, FIX(1.144122806));  /* c6 */

    tmp12 = z1 - tmp10;
    tmp13 = z1 + tmp11;
    z1 -= (tmp11 - tmp10) << 1;  /* c0 = (c6-c12)*2 */

    z4 = z2 - z3;
    z3 += z2;
    tmp10 = MULTIPLY(z3, FIX(1.337628990));  /* (c2+c4)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.045680613));  /* (c2-c4)/2 */
    z2 = MULTIPLY(z2, FIX(1.439773946));  /* c4+c14 */

    tmp20 = tmp13 + tmp10 + tmp11;
    tmp23 = tmp12 - tmp10 + tmp11 + z2;

    tmp10 = MULTIPLY(z3, FIX(0.547059574));  /* (c8+c14)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.399234004));  /* (c8-c14)/2 */

    tmp25 = tmp13 - tmp10 - tmp11;
    tmp26 = tmp12 + tmp10 - tmp11 - z2;

    tmp10 = MULTIPLY(z3, FIX(0.790569415));  /* (c6+c12)/2 */
    tmp11 = MULTIPLY(z4, FIX(0.353553391));  /* (c6-c12)/2 */

    tmp21 = tmp12 + tmp10 + tmp11;
    tmp24 = tmp13 - tmp10 + tmp11;
    tmp11 += tmp11;
    tmp22 = z1 + tmp11;  /* c10 = c6-c12 */
    tmp27 = z1 - tmp11 - tmp11;  /* c0 = (c6-c12)*2 */

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z4 = (INT32) wsptr[5];
    z3 = MULTIPLY(z4, FIX(1.224744871));  /* c5 */
    z4 = (INT32) wsptr[7];

    tmp13 = z2 - z4;
    tmp15 = MULTIPLY(z1 + tmp13, FIX(0.831253876));  /* c9 */
    tmp11 = tmp15 + MULTIPLY(z1, FIX(0.513743148));  /* c3-c9 */
    tmp14 = tmp15 - MULTIPLY(tmp13, FIX(2.176250899));  /* c3+c9 */

    tmp13 = MULTIPLY(z2, - FIX(0.831253876));  /* -c9 */
    tmp15 = MULTIPLY(z2, - FIX(1.344997024));  /* -c3 */
    z2 = z1 - z4;
    tmp12 = z3 + MULTIPLY(z2, FIX(1.406466353));  /* c1 */

    tmp10 = tmp12 + MULTIPLY(z4, FIX(2.457431844)) - tmp15;  /* c1+c7 */
    tmp16 = tmp12 -
            MULTIPLY(z1, FIX(1.112434820)) + tmp13;  /* c1-c13 */
    tmp12 = MULTIPLY(z2, FIX(1.224744871)) - z3;  /* c5 */
    z2 = MULTIPLY(z1 + z4, FIX(0.575212477));  /* c11 */
    tmp13 += z2 + MULTIPLY(z1, FIX(0.475753014)) - z3;  /* c7-c11 */
    tmp15 += z2 - MULTIPLY(z4, FIX(0.869244010)) + z3;  /* c11+c13 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[14] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26 + tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp26 - tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] =
      range_limit[(int) RIGHT_SHIFT(tmp27,
                                    CONST_BITS+PASS1_BITS+3)
                  & RANGE_MASK];

    wsptr += 8;  /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 16x16 output block.
 *
 * Optimized algorithm with 28 multiplications in the 1-D kernel.
 * cK represents sqrt(2) * cos(K*pi/32).
 */

GLOBAL(void)
jpeg_idct_16x16 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                 JCOEFPTR coef_block,
                 JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3, tmp10, tmp11, tmp12, tmp13;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26, tmp27;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*16];  /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp0 += 1 << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    tmp1 = MULTIPLY(z1, FIX(1.306562965)); /* c4[16] = c2[8] */
    tmp2 = MULTIPLY(z1, FIX_0_541196100); /* c12[16] = c6[8] */

    tmp10 = tmp0 + tmp1;
    tmp11 = tmp0 - tmp1;
    tmp12 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
    z3 = z1 - z2;
    z4 = MULTIPLY(z3, FIX(0.275899379)); /* c14[16] = c7[8] */
    z3 = MULTIPLY(z3, FIX(1.387039845)); /* c2[16] = c1[8] */

    tmp0 = z3 + MULTIPLY(z2, FIX_2_562915447); /* (c6+c2)[16] = (c3+c1)[8] */
    tmp1 = z4 + MULTIPLY(z1, FIX_0_899976223); /* (c6-c14)[16] = (c3-c7)[8] */
    tmp2 = z3 - MULTIPLY(z1, FIX(0.601344887)); /* (c2-c10)[16] = (c1-c5)[8] */
    tmp3 = z4 - MULTIPLY(z2, FIX(0.509795579)); /* (c10-c14)[16] = (c5-c7)[8] */

    tmp20 = tmp10 + tmp0;
    tmp27 = tmp10 - tmp0;
    tmp21 = tmp12 + tmp1;
    tmp26 = tmp12 - tmp1;
    tmp22 = tmp13 + tmp2;
    tmp25 = tmp13 - tmp2;
    tmp23 = tmp11 + tmp3;
    tmp24 = tmp11 - tmp3;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = z1 + z3;

    tmp1 = MULTIPLY(z1 + z2, FIX(1.353318001)); /* c3 */
    tmp2 = MULTIPLY(tmp11, FIX(1.247225013)); /* c5 */
    tmp3 = MULTIPLY(z1 + z4, FIX(1.093201867)); /* c7 */
    tmp10 = MULTIPLY(z1 - z4, FIX(0.897167586)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.666655658)); /* c11 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.410524528)); /* c13 */
    tmp0 = tmp1 + tmp2 + tmp3 -
      MULTIPLY(z1, FIX(2.286341144)); /* c7+c5+c3-c1 */
    tmp13 = tmp10 + tmp11 +
      tmp12 -
      MULTIPLY(z1, FIX(1.835730603)); /* c9+c11+c13-c15 */
    z1 = MULTIPLY(z2 + z3, FIX(0.138617169)); /* c15 */
    tmp1 += z1 + MULTIPLY(z2, FIX(0.071888074)); /* c9+c11-c3-c15 */
    tmp2 += z1 - MULTIPLY(z3, FIX(1.125726048)); /* c5+c7+c15-c3 */
    z1 = MULTIPLY(z3 - z2, FIX(1.407403738)); /* c1 */
    tmp11 += z1 - MULTIPLY(z3, FIX(0.766367282)); /* c1+c11-c9-c13 */
    tmp12 += z1 + MULTIPLY(z2, FIX(1.971951411)); /* c1+c5+c13-c7 */
    z2 += z4;
    z1 = MULTIPLY(z2, - FIX(0.666655658)); /* -c11 */
    tmp1 += z1;
    tmp3 += z1 + MULTIPLY(z4, FIX(1.065388962)); /* c3+c11+c15-c7 */
    z2 = MULTIPLY(z2, - FIX(1.247225013)); /* -c5 */
    tmp10 += z2 + MULTIPLY(z4, FIX(3.141271809)); /* c1+c5+c9-c13 */
    tmp12 += z2;
    z2 = MULTIPLY(z3 + z4, - FIX(1.353318001)); /* -c3 */
    tmp2 += z2;
    tmp3 += z2;
    z2 = MULTIPLY(z4 - z3, FIX(0.410524528)); /* c13 */
    tmp10 += z2;
    tmp11 += z2;

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*15] = (int) RIGHT_SHIFT(tmp20 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*14] = (int) RIGHT_SHIFT(tmp21 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*13] = (int) RIGHT_SHIFT(tmp22 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*12] = (int) RIGHT_SHIFT(tmp23 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*11] = (int) RIGHT_SHIFT(tmp24 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*10] = (int) RIGHT_SHIFT(tmp25 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp26 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp26 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp27 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp27 - tmp13, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 16 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 16; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;

    z1 = (INT32) wsptr[4];
    tmp1 = MULTIPLY(z1, FIX(1.306562965)); /* c4[16] = c2[8] */
    tmp2 = MULTIPLY(z1, FIX_0_541196100); /* c12[16] = c6[8] */

    tmp10 = tmp0 + tmp1;
    tmp11 = tmp0 - tmp1;
    tmp12 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[6];
    z3 = z1 - z2;
    z4 = MULTIPLY(z3, FIX(0.275899379)); /* c14[16] = c7[8] */
    z3 = MULTIPLY(z3, FIX(1.387039845)); /* c2[16] = c1[8] */

    tmp0 = z3 + MULTIPLY(z2, FIX_2_562915447); /* (c6+c2)[16] = (c3+c1)[8] */
    tmp1 = z4 + MULTIPLY(z1, FIX_0_899976223); /* (c6-c14)[16] = (c3-c7)[8] */
    tmp2 = z3 - MULTIPLY(z1, FIX(0.601344887)); /* (c2-c10)[16] = (c1-c5)[8] */
    tmp3 = z4 - MULTIPLY(z2, FIX(0.509795579)); /* (c10-c14)[16] = (c5-c7)[8] */

    tmp20 = tmp10 + tmp0;
    tmp27 = tmp10 - tmp0;
    tmp21 = tmp12 + tmp1;
    tmp26 = tmp12 - tmp1;
    tmp22 = tmp13 + tmp2;
    tmp25 = tmp13 - tmp2;
    tmp23 = tmp11 + tmp3;
    tmp24 = tmp11 - tmp3;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = z1 + z3;

    tmp1 = MULTIPLY(z1 + z2, FIX(1.353318001)); /* c3 */
    tmp2 = MULTIPLY(tmp11, FIX(1.247225013)); /* c5 */
    tmp3 = MULTIPLY(z1 + z4,
                    FIX(1.093201867)); /* c7 */
    tmp10 = MULTIPLY(z1 - z4, FIX(0.897167586)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.666655658)); /* c11 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.410524528)); /* c13 */
    tmp0 = tmp1 + tmp2 + tmp3 -
      MULTIPLY(z1, FIX(2.286341144)); /* c7+c5+c3-c1 */
    tmp13 = tmp10 + tmp11 + tmp12 -
      MULTIPLY(z1, FIX(1.835730603)); /* c9+c11+c13-c15 */
    z1 = MULTIPLY(z2 + z3, FIX(0.138617169)); /* c15 */
    tmp1 += z1 + MULTIPLY(z2, FIX(0.071888074)); /* c9+c11-c3-c15 */
    tmp2 += z1 - MULTIPLY(z3, FIX(1.125726048)); /* c5+c7+c15-c3 */
    z1 = MULTIPLY(z3 - z2, FIX(1.407403738)); /* c1 */
    tmp11 += z1 - MULTIPLY(z3, FIX(0.766367282)); /* c1+c11-c9-c13 */
    tmp12 += z1 + MULTIPLY(z2, FIX(1.971951411)); /* c1+c5+c13-c7 */
    z2 += z4;
    z1 = MULTIPLY(z2, - FIX(0.666655658)); /* -c11 */
    tmp1 += z1;
    tmp3 += z1 + MULTIPLY(z4, FIX(1.065388962)); /* c3+c11+c15-c7 */
    z2 = MULTIPLY(z2, - FIX(1.247225013)); /* -c5 */
    tmp10 += z2 + MULTIPLY(z4, FIX(3.141271809)); /* c1+c5+c9-c13 */
    tmp12 += z2;
    z2 = MULTIPLY(z3 + z4, - FIX(1.353318001)); /* -c3 */
    tmp2 += z2;
    tmp3 += z2;
    z2 = MULTIPLY(z4 - z3, FIX(0.410524528)); /* c13 */
    tmp10 += z2;
    tmp11 += z2;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[15] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp0,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[14] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp1,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp2,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp3,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp26 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp27 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp27 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 16x8 output block.
 *
 * 8-point IDCT in pass 1 (columns), 16-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_16x8 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3, tmp10, tmp11, tmp12, tmp13;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26, tmp27;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*8]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */
  /* Note results are scaled up by sqrt(8) compared to a true IDCT; */
  /* furthermore, we scale the results by 2**PASS1_BITS. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = DCTSIZE; ctr > 0; ctr--) {
    /* Due to quantization, we will usually find that many of the input
     * coefficients are zero, especially the AC terms. We can exploit this
     * by short-circuiting the IDCT calculation for any column in which all
     * the AC terms are zero. In that case each output is equal to the
     * DC coefficient (with scale factor as needed).
     * With typical images and quantization tables, half or more of the
     * column DCT calculations can be simplified this way.
     */

    if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*2] == 0 &&
        inptr[DCTSIZE*3] == 0 && inptr[DCTSIZE*4] == 0 &&
        inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*6] == 0 &&
        inptr[DCTSIZE*7] == 0) {
      /* AC terms all zero */
      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;

      wsptr[DCTSIZE*0] = dcval;
      wsptr[DCTSIZE*1] = dcval;
      wsptr[DCTSIZE*2] = dcval;
      wsptr[DCTSIZE*3] = dcval;
      wsptr[DCTSIZE*4] = dcval;
      wsptr[DCTSIZE*5] = dcval;
      wsptr[DCTSIZE*6] = dcval;
      wsptr[DCTSIZE*7] = dcval;

      inptr++; /* advance pointers to next column */
      quantptr++;
      wsptr++;
      continue;
    }

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    z2 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z2 <<= CONST_BITS;
    z3 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    z2 += ONE << (CONST_BITS-PASS1_BITS-1);

    tmp0 = z2 + z3;
    tmp1 = z2 - z3;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
    tmp1 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    tmp3 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    wsptr[DCTSIZE*0] = (int) RIGHT_SHIFT(tmp10 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*7] = (int) RIGHT_SHIFT(tmp10 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*1] = (int) RIGHT_SHIFT(tmp11 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*6] = (int) RIGHT_SHIFT(tmp11 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*2] = (int) RIGHT_SHIFT(tmp12 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*5] = (int) RIGHT_SHIFT(tmp12 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*3] = (int) RIGHT_SHIFT(tmp13 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[DCTSIZE*4] = (int) RIGHT_SHIFT(tmp13 - tmp0, CONST_BITS-PASS1_BITS);

    inptr++; /* advance pointers to next column */
    quantptr++;
    wsptr++;
  }

  /* Pass 2: process 8 rows
   * from work array, store into output array.
   * 16-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/32).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;

    z1 = (INT32) wsptr[4];
    tmp1 = MULTIPLY(z1, FIX(1.306562965)); /* c4[16] = c2[8] */
    tmp2 = MULTIPLY(z1, FIX_0_541196100); /* c12[16] = c6[8] */

    tmp10 = tmp0 + tmp1;
    tmp11 = tmp0 - tmp1;
    tmp12 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[6];
    z3 = z1 - z2;
    z4 = MULTIPLY(z3, FIX(0.275899379)); /* c14[16] = c7[8] */
    z3 = MULTIPLY(z3, FIX(1.387039845)); /* c2[16] = c1[8] */

    tmp0 = z3 + MULTIPLY(z2, FIX_2_562915447); /* (c6+c2)[16] = (c3+c1)[8] */
    tmp1 = z4 + MULTIPLY(z1, FIX_0_899976223); /* (c6-c14)[16] = (c3-c7)[8] */
    tmp2 = z3 - MULTIPLY(z1, FIX(0.601344887)); /* (c2-c10)[16] = (c1-c5)[8] */
    tmp3 = z4 - MULTIPLY(z2, FIX(0.509795579)); /* (c10-c14)[16] = (c5-c7)[8] */

    tmp20 = tmp10 + tmp0;
    tmp27 = tmp10 - tmp0;
    tmp21 = tmp12 + tmp1;
    tmp26 = tmp12 - tmp1;
    tmp22 = tmp13 + tmp2;
    tmp25 = tmp13 - tmp2;
    tmp23 = tmp11 + tmp3;
    tmp24 = tmp11 - tmp3;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = z1 + z3;

    tmp1 = MULTIPLY(z1 + z2, FIX(1.353318001)); /* c3 */
    tmp2 = MULTIPLY(tmp11, FIX(1.247225013)); /* c5 */
    tmp3 = MULTIPLY(z1 + z4, FIX(1.093201867)); /* c7 */
    tmp10 = MULTIPLY(z1 - z4, FIX(0.897167586)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.666655658)); /* c11 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.410524528)); /* c13 */
    tmp0 = tmp1 + tmp2 + tmp3 -
      MULTIPLY(z1, FIX(2.286341144)); /* c7+c5+c3-c1 */
    tmp13 = tmp10 + tmp11 + tmp12 -
      MULTIPLY(z1, FIX(1.835730603)); /* c9+c11+c13-c15 */
    z1 = MULTIPLY(z2 + z3, FIX(0.138617169)); /* c15 */
    tmp1 += z1 + MULTIPLY(z2, FIX(0.071888074)); /* c9+c11-c3-c15 */
    tmp2 += z1 - MULTIPLY(z3, FIX(1.125726048)); /* c5+c7+c15-c3 */
    z1 = MULTIPLY(z3 - z2, FIX(1.407403738)); /* c1 */
    tmp11 += z1 - MULTIPLY(z3, FIX(0.766367282)); /* c1+c11-c9-c13 */
    tmp12 += z1 + MULTIPLY(z2, FIX(1.971951411)); /* c1+c5+c13-c7 */
    z2 += z4;
    z1 = MULTIPLY(z2, - FIX(0.666655658)); /* -c11 */
    tmp1 += z1;
    tmp3 += z1 + MULTIPLY(z4, FIX(1.065388962)); /* c3+c11+c15-c7 */
    z2 = MULTIPLY(z2, - FIX(1.247225013)); /* -c5 */
    tmp10 += z2 + MULTIPLY(z4, FIX(3.141271809)); /* c1+c5+c9-c13 */
    tmp12 += z2;
    z2 = MULTIPLY(z3 + z4, - FIX(1.353318001)); /* -c3 */
    tmp2 += z2;
    tmp3 += z2;
    z2 = MULTIPLY(z4 - z3, FIX(0.410524528)); /* c13 */
    tmp10 += z2;
    tmp11 += z2;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[15] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp0,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[14] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp1,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp2,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp3,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp26 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp27 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp27 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 14x7 output block.
 *
 * 7-point IDCT in pass 1 (columns), 14-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_14x7 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*7]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 7-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/14).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp23 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp23 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp23 += ONE << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    tmp20 = MULTIPLY(z2 - z3, FIX(0.881747734)); /* c4 */
    tmp22 = MULTIPLY(z1 - z2, FIX(0.314692123)); /* c6 */
    tmp21 = tmp20 + tmp22 + tmp23 - MULTIPLY(z2, FIX(1.841218003)); /* c2+c4-c6 */
    tmp10 = z1 + z3;
    z2 -= tmp10;
    tmp10 = MULTIPLY(tmp10, FIX(1.274162392)) + tmp23; /* c2 */
    tmp20 += tmp10 - MULTIPLY(z3, FIX(0.077722536)); /* c2-c4-c6 */
    tmp22 += tmp10 - MULTIPLY(z1, FIX(2.470602249)); /* c2+c4+c6 */
    tmp23 += MULTIPLY(z2, FIX(1.414213562)); /* c0 */

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);

    tmp11 = MULTIPLY(z1 + z2, FIX(0.935414347)); /* (c3+c1-c5)/2 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.170262339)); /* (c3+c5-c1)/2 */
    tmp10 = tmp11 - tmp12;
    tmp11 += tmp12;
    tmp12 = MULTIPLY(z2 + z3, - FIX(1.378756276)); /* -c1 */
    tmp11 += tmp12;
    z2 = MULTIPLY(z1 + z3, FIX(0.613604268)); /* c5 */
    tmp10 += z2;
    tmp12 += z2 + MULTIPLY(z3, FIX(1.870828693)); /* c3+c1-c5 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 +
                                   tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 7 rows from work array, store into output array.
   * 14-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/28).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 7; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    z1 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z1 <<= CONST_BITS;
    z4 = (INT32) wsptr[4];
    z2 = MULTIPLY(z4, FIX(1.274162392)); /* c4 */
    z3 = MULTIPLY(z4, FIX(0.314692123)); /* c12 */
    z4 = MULTIPLY(z4, FIX(0.881747734)); /* c8 */

    tmp10 = z1 + z2;
    tmp11 = z1 + z3;
    tmp12 = z1 - z4;

    tmp23 = z1 - ((z2 + z3 - z4) << 1); /* c0 = (c4+c12-c8)*2 */

    z1 = (INT32) wsptr[2];
    z2 = (INT32) wsptr[6];

    z3 = MULTIPLY(z1 + z2, FIX(1.105676686)); /* c6 */

    tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590)); /* c2-c6 */
    tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954)); /* c6+c10 */
    tmp15 = MULTIPLY(z1, FIX(0.613604268)) - /* c10 */
            MULTIPLY(z2, FIX(1.378756276)); /* c2 */

    tmp20 = tmp10 + tmp13;
    tmp26 = tmp10 - tmp13;
    tmp21 = tmp11 + tmp14;
    tmp25 = tmp11 - tmp14;
    tmp22 = tmp12 + tmp15;
    tmp24 = tmp12 - tmp15;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];
    z4 <<= CONST_BITS;

    tmp14 = z1 + z3;
    tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607)); /* c3 */
    tmp12 = MULTIPLY(tmp14, FIX(1.197448846)); /* c5 */
    tmp10 = tmp11 + tmp12 + z4
      - MULTIPLY(z1, FIX(1.126980169)); /* c3+c5-c1 */
    tmp14 = MULTIPLY(tmp14, FIX(0.752406978)); /* c9 */
    tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426)); /* c9+c11-c13 */
    z1 -= z2;
    tmp15 = MULTIPLY(z1, FIX(0.467085129)) - z4; /* c11 */
    tmp16 += tmp15;
    tmp13 = MULTIPLY(z2 + z3, - FIX(0.158341681)) - z4; /* -c13 */
    tmp11 += tmp13 - MULTIPLY(z2, FIX(0.424103948)); /* c3-c9-c13 */
    tmp12 += tmp13 - MULTIPLY(z3, FIX(2.373959773)); /* c3+c5-c13 */
    tmp13 = MULTIPLY(z3 - z2, FIX(1.405321284)); /* c1 */
    tmp14 += tmp13 + z4 - MULTIPLY(z3, FIX(1.6906431334)); /* c1+c9-c11 */
    tmp15 += tmp13 + MULTIPLY(z2, FIX(0.674957567)); /* c1+c11-c5 */

    tmp13 = ((z1 - z3) << CONST_BITS) + z4;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[13] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[12] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25
                                              + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp26 + tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp26 - tmp16,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 12x6 output block.
 *
 * 6-point IDCT in pass 1 (columns), 12-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_12x6 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*6]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 6-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/12).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp10 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp10 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp10 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp12 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    tmp20 = MULTIPLY(tmp12, FIX(0.707106781)); /* c4 */
    tmp11 = tmp10 + tmp20;
    tmp21 = RIGHT_SHIFT(tmp10 - tmp20 - tmp20, CONST_BITS-PASS1_BITS);
    tmp20 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp10 = MULTIPLY(tmp20, FIX(1.224744871)); /* c2 */
    tmp20 = tmp11 + tmp10;
    tmp22 = tmp11 - tmp10;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp11 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
    tmp10 = tmp11 + ((z1 + z2) << CONST_BITS);
    tmp12 = tmp11 + ((z3 - z2) << CONST_BITS);
    tmp11 = (z1 - z2 - z3) << PASS1_BITS;

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) (tmp21 + tmp11);
    wsptr[8*4] = (int) (tmp21 - tmp11);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 6 rows from work array, store into output array.
   * 12-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/24).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 6; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 <<= CONST_BITS;

    z4 = (INT32) wsptr[4];
    z4 = MULTIPLY(z4, FIX(1.224744871)); /* c4 */

    tmp10 = z3 + z4;
    tmp11 = z3 - z4;

    z1 = (INT32) wsptr[2];
    z4 = MULTIPLY(z1, FIX(1.366025404)); /* c2 */
    z1 <<= CONST_BITS;
    z2 = (INT32) wsptr[6];
    z2 <<= CONST_BITS;

    tmp12 = z1 - z2;

    tmp21 = z3 + tmp12;
    tmp24 = z3 - tmp12;

    tmp12 = z4 + z2;

    tmp20 = tmp10 + tmp12;
    tmp25 = tmp10 - tmp12;

    tmp12 = z4 - z1 - z2;

    tmp22 = tmp11 + tmp12;
    tmp23 = tmp11 - tmp12;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z4 = (INT32) wsptr[7];

    tmp11 = MULTIPLY(z2, FIX(1.306562965)); /* c3 */
    tmp14 = MULTIPLY(z2, - FIX_0_541196100); /* -c9 */

    tmp10 = z1 + z3;
    tmp15 = MULTIPLY(tmp10 + z4, FIX(0.860918669)); /* c7 */
    tmp12 = tmp15 + MULTIPLY(tmp10, FIX(0.261052384)); /* c5-c7 */
    tmp10 = tmp12 + tmp11 + MULTIPLY(z1, FIX(0.280143716)); /* c1-c5 */
    tmp13 = MULTIPLY(z3 + z4, - FIX(1.045510580)); /* -(c7+c11) */
    tmp12 += tmp13 + tmp14 - MULTIPLY(z3, FIX(1.478575242)); /* c1+c5-c7-c11 */
    tmp13 += tmp15 - tmp11 + MULTIPLY(z4, FIX(1.586706681)); /* c1+c11 */
    tmp15 += tmp14 - MULTIPLY(z1, FIX(0.676326758)) - /* c7-c11 */
             MULTIPLY(z4, FIX(1.982889723)); /* c5+c7 */

    z1 -= z4;
    z2 -= z3;
    z3 = MULTIPLY(z1 + z2, FIX_0_541196100); /* c9 */
    tmp11 = z3 + MULTIPLY(z1, FIX_0_765366865); /* c3-c9 */
    tmp14 = z3 - MULTIPLY(z2, FIX_1_847759065); /* c3+c9 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[11] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                               CONST_BITS+PASS1_BITS+3)
                             &
                             RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[10] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                               CONST_BITS+PASS1_BITS+3)
                             & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp25 + tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp25 - tmp15,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 10x5 output block.
 *
 * 5-point IDCT in pass 1 (columns), 10-point in pass 2 (rows).
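 *
 * The "fudge factor" additions in this routine appear to implement
 * round-to-nearest descaling: half of the divisor is added before the right
 * shift. In pass 2 the constant ONE << (PASS1_BITS+2) is added before the
 * value is scaled up by CONST_BITS, which makes it half of
 * 2^(CONST_BITS+PASS1_BITS+3). The following is a minimal sketch of that
 * idea only, not part of the IJG sources. CONST_BITS = 13 and
 * PASS1_BITS = 2 are assumed values (this listing does not show them), and
 * descale_floor, descale_round, pass1_descale and pass2_descale are
 * hypothetical helper names.

```c
#include <assert.h>

// Assumed values (typical for 8-bit samples); not taken from this listing.
#define SKETCH_CONST_BITS 13
#define SKETCH_PASS1_BITS 2

// Hypothetical helper: divide by 2^n rounding toward minus infinity,
// written without relying on right-shifting negative values.
static long descale_floor(long x, int n)
{
  long d = 1L << n;
  long q = x / d;
  if (x % d < 0)
    q--;
  return q;
}

// Hypothetical helper: add half of the divisor first (the "fudge factor"),
// so that the flooring division rounds to the nearest integer.
static long descale_round(long x, int n)
{
  return descale_floor(x + (1L << (n - 1)), n);
}

// Pass 1 style: the value carries a 2^CONST_BITS scale and is brought down
// to a 2^PASS1_BITS scale, so the shift count is CONST_BITS-PASS1_BITS.
static long pass1_descale(long x)
{
  return descale_round(x, SKETCH_CONST_BITS - SKETCH_PASS1_BITS);
}

// Pass 2 style: remove CONST_BITS, PASS1_BITS and the factor of 8 (3 bits).
static long pass2_descale(long x)
{
  return descale_round(x, SKETCH_CONST_BITS + SKETCH_PASS1_BITS + 3);
}
```

 * With these assumptions, 45 descaled by 3 bits floors to 5 without the
 * fudge factor and rounds to 6 with it, since 45/8 = 5.625.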
 */

GLOBAL(void)
jpeg_idct_10x5 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*5]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 5-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/10).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp12 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp12 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp12 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp13 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp14 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z1 = MULTIPLY(tmp13 + tmp14, FIX(0.790569415)); /* (c2+c4)/2 */
    z2 = MULTIPLY(tmp13 - tmp14, FIX(0.353553391)); /* (c2-c4)/2 */
    z3 = tmp12 + z2;
    tmp10 = z3 + z1;
    tmp11 = z3 - z1;
    tmp12 -= z2 << 2;

    /* Odd part */

    z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876)); /* c3 */
    tmp13 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c1-c3 */
    tmp14 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c1+c3 */

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp10 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp10 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp11 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp11 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp12, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 5 rows from work array, store into output array.
   * 10-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/20).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 5; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    z3 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 <<= CONST_BITS;
    z4 = (INT32) wsptr[4];
    z1 = MULTIPLY(z4, FIX(1.144122806)); /* c4 */
    z2 = MULTIPLY(z4, FIX(0.437016024)); /* c8 */
    tmp10 = z3 + z1;
    tmp11 = z3 - z2;

    tmp22 = z3 - ((z1 - z2) << 1); /* c0 = (c4-c8)*2 */

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[6];

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876)); /* c6 */
    tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
    tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */

    tmp20 = tmp10 + tmp12;
    tmp24 = tmp10 - tmp12;
    tmp21 = tmp11 + tmp13;
    tmp23 = tmp11 - tmp13;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    z3 <<= CONST_BITS;
    z4 = (INT32) wsptr[7];

    tmp11 = z2 + z4;
    tmp13 = z2 - z4;

    tmp12 = MULTIPLY(tmp13, FIX(0.309016994)); /* (c3-c7)/2 */

    z2 = MULTIPLY(tmp11, FIX(0.951056516)); /* (c3+c7)/2 */
    z4 = z3 + tmp12;

    tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
    tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */

    z2 = MULTIPLY(tmp11, FIX(0.587785252)); /* (c1-c9)/2 */
    z4 = z3 - tmp12 - (tmp13 << (CONST_BITS - 1));

    tmp12 = ((z1 - tmp13) << CONST_BITS) - z3;

    tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
    tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[9] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[8] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp23 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp23 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp24 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp24 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 8; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing an 8x4 output block.
 *
 * 4-point IDCT in pass 1 (columns), 8-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_8x4 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3;
  INT32 tmp10, tmp11, tmp12, tmp13;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*4]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 4-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/16).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);

    tmp10 = (tmp0 + tmp2) << PASS1_BITS;
    tmp12 = (tmp0 - tmp2) << PASS1_BITS;

    /* Odd part */
    /* Same rotation as in the even part of the 8x8 LL&M IDCT */

    z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100); /* c6 */
    /* Add fudge factor here for final descale. */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp0 = RIGHT_SHIFT(z1 + MULTIPLY(z2, FIX_0_765366865), /* c2-c6 */
                       CONST_BITS-PASS1_BITS);
    tmp2 = RIGHT_SHIFT(z1 - MULTIPLY(z3, FIX_1_847759065), /* c2+c6 */
                       CONST_BITS-PASS1_BITS);

    /* Final output stage */

    wsptr[8*0] = (int) (tmp10 + tmp0);
    wsptr[8*3] = (int) (tmp10 - tmp0);
    wsptr[8*1] = (int) (tmp12 + tmp2);
    wsptr[8*2] = (int) (tmp12 - tmp2);
  }

  /* Pass 2: process rows from work array, store into output array. */
  /* Note that we must descale the results by a factor of 8 == 2**3, */
  /* and also undo the PASS1_BITS scaling. */

  wsptr = workspace;
  for (ctr = 0; ctr < 4; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[6];

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    /* Add fudge factor here for final descale.
     */
    z2 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 = (INT32) wsptr[4];

    tmp0 = (z2 + z3) << CONST_BITS;
    tmp1 = (z2 - z3) << CONST_BITS;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = (INT32) wsptr[7];
    tmp1 = (INT32) wsptr[5];
    tmp2 = (INT32) wsptr[3];
    tmp3 = (INT32) wsptr[1];

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] =
      range_limit[(int) RIGHT_SHIFT(tmp12 + tmp1,
                                    CONST_BITS+PASS1_BITS+3)
                  & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp13 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp13 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += DCTSIZE; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 6x3 output block.
 *
 * 3-point IDCT in pass 1 (columns), 6-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_6x3 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp10, tmp11, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[6*3]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 3-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/6).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 6; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
    tmp10 = tmp0 + tmp12;
    tmp2 = tmp0 - tmp12 - tmp12;

    /* Odd part */

    tmp12 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    tmp0 = MULTIPLY(tmp12, FIX(1.224744871)); /* c1 */

    /* Final output stage */

    wsptr[6*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[6*2] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[6*1] = (int) RIGHT_SHIFT(tmp2, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 3 rows from work array, store into output array.
   * 6-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/12).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 3; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;
    tmp2 = (INT32) wsptr[4];
    tmp10 = MULTIPLY(tmp2, FIX(0.707106781)); /* c4 */
    tmp1 = tmp0 + tmp10;
    tmp11 = tmp0 - tmp10 - tmp10;
    tmp10 = (INT32) wsptr[2];
    tmp0 = MULTIPLY(tmp10, FIX(1.224744871)); /* c2 */
    tmp10 = tmp1 + tmp0;
    tmp12 = tmp1 - tmp0;

    /* Odd part */

    z1 = (INT32) wsptr[1];
    z2 = (INT32) wsptr[3];
    z3 = (INT32) wsptr[5];
    tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
    tmp1 = (z1 - z2 - z3) << CONST_BITS;

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 6; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 4x2 output block.
 *
 * 2-point IDCT in pass 1 (columns), 4-point in pass 2 (rows).
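 *
 * Pass 2 of this routine multiplies by FIX_0_541196100, FIX_0_765366865
 * and FIX_1_847759065. As a hedged sketch (not part of the IJG sources),
 * such a constant can be read as the real number in its name scaled by
 * 2^CONST_BITS and rounded to an integer, with the scale removed again by
 * the final right shift. CONST_BITS = 13 is an assumed value (this listing
 * does not show it), and sketch_fix and sketch_multiply_descale are
 * hypothetical helper names.

```c
#include <assert.h>

// Assumed scale; the listing only uses the symbol CONST_BITS.
#define SKETCH_CONST_BITS 13

// Hypothetical helper: convert a positive real constant to fixed point by
// scaling with 2^CONST_BITS and rounding to the nearest integer.
static long sketch_fix(double x)
{
  return (long) (x * (double) (1L << SKETCH_CONST_BITS) + 0.5);
}

// Hypothetical helper: multiply a non-negative integer by a fixed-point
// constant and remove the 2^CONST_BITS scale again, rounding to nearest.
static long sketch_multiply_descale(long v, long fixed_const)
{
  long product = v * fixed_const;
  long half = 1L << (SKETCH_CONST_BITS - 1);
  return (product + half) >> SKETCH_CONST_BITS;
}
```

 * Under the assumed scale, 0.541196100 becomes 4433, and multiplying 100
 * by that constant and descaling gives 54, close to the real product
 * 54.12.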
 */

GLOBAL(void)
jpeg_idct_4x2 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp2, tmp10, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  INT32 * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  INT32 workspace[4*2]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 4; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp10 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);

    /* Odd part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);

    /* Final output stage */

    wsptr[4*0] = tmp10 + tmp0;
    wsptr[4*1] = tmp10 - tmp0;
  }

  /* Pass 2: process 2 rows from work array, store into output array.
   * 4-point IDCT kernel,
   * cK represents sqrt(2) * cos(K*pi/16) [refers to 8-point IDCT].
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 2; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
     */
    tmp0 = wsptr[0] + (ONE << 2);
    tmp2 = wsptr[2];

    tmp10 = (tmp0 + tmp2) << CONST_BITS;
    tmp12 = (tmp0 - tmp2) << CONST_BITS;

    /* Odd part */
    /* Same rotation as in the even part of the 8x8 LL&M IDCT */

    z2 = wsptr[1];
    z3 = wsptr[3];

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100); /* c6 */
    tmp0 = z1 + MULTIPLY(z2, FIX_0_765366865); /* c2-c6 */
    tmp2 = z1 - MULTIPLY(z3, FIX_1_847759065); /* c2+c6 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2,
                                              CONST_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+3)
                            & RANGE_MASK];

    wsptr += 4; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 2x1 output block.
 *
 * 1-point IDCT in pass 1 (columns), 2-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_2x1 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp10;
  ISLOW_MULT_TYPE * quantptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  SHIFT_TEMPS

  /* Pass 1: empty. */

  /* Pass 2: process 1 row from input, store into output array. */

  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  outptr = output_buf[0] + output_col;

  /* Even part */

  tmp10 = DEQUANTIZE(coef_block[0], quantptr[0]);
  /* Add fudge factor here for final descale.
   */
  tmp10 += ONE << 2;

  /* Odd part */

  tmp0 = DEQUANTIZE(coef_block[1], quantptr[1]);

  /* Final output stage */

  outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0, 3) & RANGE_MASK];
  outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0, 3) & RANGE_MASK];
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing an 8x16 output block.
 *
 * 16-point IDCT in pass 1 (columns), 8-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_8x16 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3, tmp10, tmp11, tmp12, tmp13;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26, tmp27;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[8*16]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 16-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/32).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);

    z1 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    tmp1 = MULTIPLY(z1, FIX(1.306562965)); /* c4[16] = c2[8] */
    tmp2 = MULTIPLY(z1, FIX_0_541196100); /* c12[16] = c6[8] */

    tmp10 = tmp0 + tmp1;
    tmp11 = tmp0 - tmp1;
    tmp12 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
    z3 = z1 - z2;
    z4 = MULTIPLY(z3, FIX(0.275899379)); /* c14[16] = c7[8] */
    z3 = MULTIPLY(z3, FIX(1.387039845)); /* c2[16] = c1[8] */

    tmp0 = z3 + MULTIPLY(z2, FIX_2_562915447); /* (c6+c2)[16] = (c3+c1)[8] */
    tmp1 = z4 + MULTIPLY(z1, FIX_0_899976223); /* (c6-c14)[16] = (c3-c7)[8] */
    tmp2 = z3 - MULTIPLY(z1, FIX(0.601344887)); /* (c2-c10)[16] = (c1-c5)[8] */
    tmp3 = z4 - MULTIPLY(z2, FIX(0.509795579)); /* (c10-c14)[16] = (c5-c7)[8] */

    tmp20 = tmp10 + tmp0;
    tmp27 = tmp10 - tmp0;
    tmp21 = tmp12 + tmp1;
    tmp26 = tmp12 - tmp1;
    tmp22 = tmp13 + tmp2;
    tmp25 = tmp13 - tmp2;
    tmp23 = tmp11 + tmp3;
    tmp24 = tmp11 - tmp3;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = z1 + z3;

    tmp1 = MULTIPLY(z1 + z2, FIX(1.353318001)); /* c3 */
    tmp2 = MULTIPLY(tmp11, FIX(1.247225013)); /* c5 */
    tmp3 = MULTIPLY(z1 + z4, FIX(1.093201867)); /* c7 */
    tmp10 = MULTIPLY(z1 - z4, FIX(0.897167586)); /* c9 */
    tmp11 = MULTIPLY(tmp11, FIX(0.666655658)); /* c11 */
    tmp12 = MULTIPLY(z1 - z2, FIX(0.410524528)); /* c13 */
    tmp0 = tmp1 + tmp2 + tmp3 -
           MULTIPLY(z1, FIX(2.286341144)); /* c7+c5+c3-c1 */
    tmp13 = tmp10 + tmp11 +
            tmp12 -
            MULTIPLY(z1, FIX(1.835730603)); /* c9+c11+c13-c15 */
    z1 = MULTIPLY(z2 + z3, FIX(0.138617169)); /* c15 */
    tmp1 += z1 + MULTIPLY(z2, FIX(0.071888074)); /* c9+c11-c3-c15 */
    tmp2 += z1 - MULTIPLY(z3, FIX(1.125726048)); /* c5+c7+c15-c3 */
    z1 = MULTIPLY(z3 - z2, FIX(1.407403738)); /* c1 */
    tmp11 += z1 - MULTIPLY(z3, FIX(0.766367282)); /* c1+c11-c9-c13 */
    tmp12 += z1 + MULTIPLY(z2, FIX(1.971951411)); /* c1+c5+c13-c7 */
    z2 += z4;
    z1 = MULTIPLY(z2, - FIX(0.666655658)); /* -c11 */
    tmp1 += z1;
    tmp3 += z1 + MULTIPLY(z4, FIX(1.065388962)); /* c3+c11+c15-c7 */
    z2 = MULTIPLY(z2, - FIX(1.247225013)); /* -c5 */
    tmp10 += z2 + MULTIPLY(z4, FIX(3.141271809)); /* c1+c5+c9-c13 */
    tmp12 += z2;
    z2 = MULTIPLY(z3 + z4, - FIX(1.353318001)); /* -c3 */
    tmp2 += z2;
    tmp3 += z2;
    z2 = MULTIPLY(z4 - z3, FIX(0.410524528)); /* c13 */
    tmp10 += z2;
    tmp11 += z2;

    /* Final output stage */

    wsptr[8*0] = (int) RIGHT_SHIFT(tmp20 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*15] = (int) RIGHT_SHIFT(tmp20 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[8*1] = (int) RIGHT_SHIFT(tmp21 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*14] = (int) RIGHT_SHIFT(tmp21 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[8*2] = (int) RIGHT_SHIFT(tmp22 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*13] = (int) RIGHT_SHIFT(tmp22 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[8*3] = (int) RIGHT_SHIFT(tmp23 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*12] = (int) RIGHT_SHIFT(tmp23 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[8*4] = (int) RIGHT_SHIFT(tmp24 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*11] = (int) RIGHT_SHIFT(tmp24 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[8*5] = (int) RIGHT_SHIFT(tmp25 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*10] = (int) RIGHT_SHIFT(tmp25 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[8*6] = (int) RIGHT_SHIFT(tmp26 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*9] = (int) RIGHT_SHIFT(tmp26 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[8*7] = (int) RIGHT_SHIFT(tmp27 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[8*8] = (int) RIGHT_SHIFT(tmp27 - tmp13, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process rows from work array, store into output array. */
  /* Note that we must descale the results by a factor of 8 == 2**3, */
  /* and also undo the PASS1_BITS scaling. */

  wsptr = workspace;
  for (ctr = 0; ctr < 16; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = (INT32) wsptr[2];
    z3 = (INT32) wsptr[6];

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    /* Add fudge factor here for final descale. */
    z2 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    z3 = (INT32) wsptr[4];

    tmp0 = (z2 + z3) << CONST_BITS;
    tmp1 = (z2 - z3) << CONST_BITS;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = (INT32) wsptr[7];
    tmp1 = (INT32) wsptr[5];
    tmp2 = (INT32) wsptr[3];
    tmp3 = (INT32) wsptr[1];

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[7] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp3,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp1,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp13 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp13 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += DCTSIZE; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 7x14 output block.
 *
 * 14-point IDCT in pass 1 (columns), 7-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_7x14 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15, tmp16;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25, tmp26;
  INT32 z1, z2, z3, z4;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[7*14]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 14-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/28).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 7; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z1 <<= CONST_BITS;
    /* Add fudge factor here for final descale.
     */
    z1 += ONE << (CONST_BITS-PASS1_BITS-1);
    z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z2 = MULTIPLY(z4, FIX(1.274162392)); /* c4 */
    z3 = MULTIPLY(z4, FIX(0.314692123)); /* c12 */
    z4 = MULTIPLY(z4, FIX(0.881747734)); /* c8 */

    tmp10 = z1 + z2;
    tmp11 = z1 + z3;
    tmp12 = z1 - z4;

    tmp23 = RIGHT_SHIFT(z1 - ((z2 + z3 - z4) << 1), /* c0 = (c4+c12-c8)*2 */
                        CONST_BITS-PASS1_BITS);

    z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z3 = MULTIPLY(z1 + z2, FIX(1.105676686)); /* c6 */

    tmp13 = z3 + MULTIPLY(z1, FIX(0.273079590)); /* c2-c6 */
    tmp14 = z3 - MULTIPLY(z2, FIX(1.719280954)); /* c6+c10 */
    tmp15 = MULTIPLY(z1, FIX(0.613604268)) - /* c10 */
            MULTIPLY(z2, FIX(1.378756276)); /* c2 */

    tmp20 = tmp10 + tmp13;
    tmp26 = tmp10 - tmp13;
    tmp21 = tmp11 + tmp14;
    tmp25 = tmp11 - tmp14;
    tmp22 = tmp12 + tmp15;
    tmp24 = tmp12 - tmp15;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
    tmp13 = z4 << CONST_BITS;

    tmp14 = z1 + z3;
    tmp11 = MULTIPLY(z1 + z2, FIX(1.334852607)); /* c3 */
    tmp12 = MULTIPLY(tmp14, FIX(1.197448846)); /* c5 */
    tmp10 = tmp11 + tmp12 + tmp13 - MULTIPLY(z1, FIX(1.126980169)); /* c3+c5-c1 */
    tmp14 = MULTIPLY(tmp14, FIX(0.752406978)); /* c9 */
    tmp16 = tmp14 - MULTIPLY(z1, FIX(1.061150426)); /* c9+c11-c13 */
    z1 -= z2;
    tmp15 = MULTIPLY(z1, FIX(0.467085129)) - tmp13; /* c11 */
    tmp16 += tmp15;
    z1 += z4;
    z4 = MULTIPLY(z2 + z3, - FIX(0.158341681)) - tmp13; /* -c13 */
    tmp11 += z4 - MULTIPLY(z2, FIX(0.424103948)); /* c3-c9-c13 */
    tmp12 += z4 - MULTIPLY(z3, FIX(2.373959773)); /* c3+c5-c13 */
    z4 = MULTIPLY(z3 - z2, FIX(1.405321284)); /* c1 */
    tmp14 += z4 + tmp13 - MULTIPLY(z3, FIX(1.6906431334)); /* c1+c9-c11 */
    tmp15 += z4 + MULTIPLY(z2, FIX(0.674957567)); /* c1+c11-c5 */

    tmp13 = (z1 - z3) << PASS1_BITS;

    /* Final output stage */

    wsptr[7*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[7*13] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[7*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[7*12] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[7*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS);
    wsptr[7*11] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS);
    wsptr[7*3] = (int) (tmp23 + tmp13);
    wsptr[7*10] = (int) (tmp23 - tmp13);
    wsptr[7*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[7*9] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
    wsptr[7*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS);
    wsptr[7*8] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS);
    wsptr[7*6] = (int) RIGHT_SHIFT(tmp26 + tmp16, CONST_BITS-PASS1_BITS);
    wsptr[7*7] = (int) RIGHT_SHIFT(tmp26 - tmp16, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 14 rows from work array, store into output array.
   * 7-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/14).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 14; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale.
*/ 04351 tmp23 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2)); 04352 tmp23 <<= CONST_BITS; 04353 04354 z1 = (INT32) wsptr[2]; 04355 z2 = (INT32) wsptr[4]; 04356 z3 = (INT32) wsptr[6]; 04357 04358 tmp20 = MULTIPLY(z2 - z3, FIX(0.881747734)); /* c4 */ 04359 tmp22 = MULTIPLY(z1 - z2, FIX(0.314692123)); /* c6 */ 04360 tmp21 = tmp20 + tmp22 + tmp23 - MULTIPLY(z2, FIX(1.841218003)); /* c2+c4-c6 */ 04361 tmp10 = z1 + z3; 04362 z2 -= tmp10; 04363 tmp10 = MULTIPLY(tmp10, FIX(1.274162392)) + tmp23; /* c2 */ 04364 tmp20 += tmp10 - MULTIPLY(z3, FIX(0.077722536)); /* c2-c4-c6 */ 04365 tmp22 += tmp10 - MULTIPLY(z1, FIX(2.470602249)); /* c2+c4+c6 */ 04366 tmp23 += MULTIPLY(z2, FIX(1.414213562)); /* c0 */ 04367 04368 /* Odd part */ 04369 04370 z1 = (INT32) wsptr[1]; 04371 z2 = (INT32) wsptr[3]; 04372 z3 = (INT32) wsptr[5]; 04373 04374 tmp11 = MULTIPLY(z1 + z2, FIX(0.935414347)); /* (c3+c1-c5)/2 */ 04375 tmp12 = MULTIPLY(z1 - z2, FIX(0.170262339)); /* (c3+c5-c1)/2 */ 04376 tmp10 = tmp11 - tmp12; 04377 tmp11 += tmp12; 04378 tmp12 = MULTIPLY(z2 + z3, - FIX(1.378756276)); /* -c1 */ 04379 tmp11 += tmp12; 04380 z2 = MULTIPLY(z1 + z3, FIX(0.613604268)); /* c5 */ 04381 tmp10 += z2; 04382 tmp12 += z2 + MULTIPLY(z3, FIX(1.870828693)); /* c3+c1-c5 */ 04383 04384 /* Final output stage */ 04385 04386 outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10, 04387 CONST_BITS+PASS1_BITS+3) 04388 & RANGE_MASK]; 04389 outptr[6] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10, 04390 CONST_BITS+PASS1_BITS+3) 04391 & RANGE_MASK]; 04392 outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11, 04393 CONST_BITS+PASS1_BITS+3) 04394 & RANGE_MASK]; 04395 outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11, 04396 CONST_BITS+PASS1_BITS+3) 04397 & RANGE_MASK]; 04398 outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12, 04399 CONST_BITS+PASS1_BITS+3) 04400 & RANGE_MASK]; 04401 outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12, 04402 CONST_BITS+PASS1_BITS+3) 04403 & RANGE_MASK]; 04404 outptr[3] = 
range_limit[(int) RIGHT_SHIFT(tmp23, 04405 CONST_BITS+PASS1_BITS+3) 04406 & RANGE_MASK]; 04407 04408 wsptr += 7; /* advance pointer to next row */ 04409 } 04410 } 04411 04412 04413 /* 04414 * Perform dequantization and inverse DCT on one block of coefficients, 04415 * producing a 6x12 output block. 04416 * 04417 * 12-point IDCT in pass 1 (columns), 6-point in pass 2 (rows). 04418 */ 04419 04420 GLOBAL(void) 04421 jpeg_idct_6x12 (j_decompress_ptr cinfo, jpeg_component_info * compptr, 04422 JCOEFPTR coef_block, 04423 JSAMPARRAY output_buf, JDIMENSION output_col) 04424 { 04425 INT32 tmp10, tmp11, tmp12, tmp13, tmp14, tmp15; 04426 INT32 tmp20, tmp21, tmp22, tmp23, tmp24, tmp25; 04427 INT32 z1, z2, z3, z4; 04428 JCOEFPTR inptr; 04429 ISLOW_MULT_TYPE * quantptr; 04430 int * wsptr; 04431 JSAMPROW outptr; 04432 JSAMPLE *range_limit = IDCT_range_limit(cinfo); 04433 int ctr; 04434 int workspace[6*12]; /* buffers data between passes */ 04435 SHIFT_TEMPS 04436 04437 /* Pass 1: process columns from input, store into work array. 04438 * 12-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/24). 04439 */ 04440 inptr = coef_block; 04441 quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table; 04442 wsptr = workspace; 04443 for (ctr = 0; ctr < 6; ctr++, inptr++, quantptr++, wsptr++) { 04444 /* Even part */ 04445 04446 z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]); 04447 z3 <<= CONST_BITS; 04448 /* Add fudge factor here for final descale. 
*/ 04449 z3 += ONE << (CONST_BITS-PASS1_BITS-1); 04450 04451 z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]); 04452 z4 = MULTIPLY(z4, FIX(1.224744871)); /* c4 */ 04453 04454 tmp10 = z3 + z4; 04455 tmp11 = z3 - z4; 04456 04457 z1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]); 04458 z4 = MULTIPLY(z1, FIX(1.366025404)); /* c2 */ 04459 z1 <<= CONST_BITS; 04460 z2 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]); 04461 z2 <<= CONST_BITS; 04462 04463 tmp12 = z1 - z2; 04464 04465 tmp21 = z3 + tmp12; 04466 tmp24 = z3 - tmp12; 04467 04468 tmp12 = z4 + z2; 04469 04470 tmp20 = tmp10 + tmp12; 04471 tmp25 = tmp10 - tmp12; 04472 04473 tmp12 = z4 - z1 - z2; 04474 04475 tmp22 = tmp11 + tmp12; 04476 tmp23 = tmp11 - tmp12; 04477 04478 /* Odd part */ 04479 04480 z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]); 04481 z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]); 04482 z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]); 04483 z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]); 04484 04485 tmp11 = MULTIPLY(z2, FIX(1.306562965)); /* c3 */ 04486 tmp14 = MULTIPLY(z2, - FIX_0_541196100); /* -c9 */ 04487 04488 tmp10 = z1 + z3; 04489 tmp15 = MULTIPLY(tmp10 + z4, FIX(0.860918669)); /* c7 */ 04490 tmp12 = tmp15 + MULTIPLY(tmp10, FIX(0.261052384)); /* c5-c7 */ 04491 tmp10 = tmp12 + tmp11 + MULTIPLY(z1, FIX(0.280143716)); /* c1-c5 */ 04492 tmp13 = MULTIPLY(z3 + z4, - FIX(1.045510580)); /* -(c7+c11) */ 04493 tmp12 += tmp13 + tmp14 - MULTIPLY(z3, FIX(1.478575242)); /* c1+c5-c7-c11 */ 04494 tmp13 += tmp15 - tmp11 + MULTIPLY(z4, FIX(1.586706681)); /* c1+c11 */ 04495 tmp15 += tmp14 - MULTIPLY(z1, FIX(0.676326758)) - /* c7-c11 */ 04496 MULTIPLY(z4, FIX(1.982889723)); /* c5+c7 */ 04497 04498 z1 -= z4; 04499 z2 -= z3; 04500 z3 = MULTIPLY(z1 + z2, FIX_0_541196100); /* c9 */ 04501 tmp11 = z3 + MULTIPLY(z1, FIX_0_765366865); /* c3-c9 */ 04502 tmp14 = z3 - MULTIPLY(z2, FIX_1_847759065); /* c3+c9 */ 04503 04504 /* Final output stage */ 04505 04506 wsptr[6*0] = (int) 
RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS); 04507 wsptr[6*11] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS); 04508 wsptr[6*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS); 04509 wsptr[6*10] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS); 04510 wsptr[6*2] = (int) RIGHT_SHIFT(tmp22 + tmp12, CONST_BITS-PASS1_BITS); 04511 wsptr[6*9] = (int) RIGHT_SHIFT(tmp22 - tmp12, CONST_BITS-PASS1_BITS); 04512 wsptr[6*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS); 04513 wsptr[6*8] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS); 04514 wsptr[6*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS); 04515 wsptr[6*7] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS); 04516 wsptr[6*5] = (int) RIGHT_SHIFT(tmp25 + tmp15, CONST_BITS-PASS1_BITS); 04517 wsptr[6*6] = (int) RIGHT_SHIFT(tmp25 - tmp15, CONST_BITS-PASS1_BITS); 04518 } 04519 04520 /* Pass 2: process 12 rows from work array, store into output array. 04521 * 6-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/12). 04522 */ 04523 wsptr = workspace; 04524 for (ctr = 0; ctr < 12; ctr++) { 04525 outptr = output_buf[ctr] + output_col; 04526 04527 /* Even part */ 04528 04529 /* Add fudge factor here for final descale. 
*/ 04530 tmp10 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2)); 04531 tmp10 <<= CONST_BITS; 04532 tmp12 = (INT32) wsptr[4]; 04533 tmp20 = MULTIPLY(tmp12, FIX(0.707106781)); /* c4 */ 04534 tmp11 = tmp10 + tmp20; 04535 tmp21 = tmp10 - tmp20 - tmp20; 04536 tmp20 = (INT32) wsptr[2]; 04537 tmp10 = MULTIPLY(tmp20, FIX(1.224744871)); /* c2 */ 04538 tmp20 = tmp11 + tmp10; 04539 tmp22 = tmp11 - tmp10; 04540 04541 /* Odd part */ 04542 04543 z1 = (INT32) wsptr[1]; 04544 z2 = (INT32) wsptr[3]; 04545 z3 = (INT32) wsptr[5]; 04546 tmp11 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */ 04547 tmp10 = tmp11 + ((z1 + z2) << CONST_BITS); 04548 tmp12 = tmp11 + ((z3 - z2) << CONST_BITS); 04549 tmp11 = (z1 - z2 - z3) << CONST_BITS; 04550 04551 /* Final output stage */ 04552 04553 outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp20 + tmp10, 04554 CONST_BITS+PASS1_BITS+3) 04555 & RANGE_MASK]; 04556 outptr[5] = range_limit[(int) RIGHT_SHIFT(tmp20 - tmp10, 04557 CONST_BITS+PASS1_BITS+3) 04558 & RANGE_MASK]; 04559 outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp21 + tmp11, 04560 CONST_BITS+PASS1_BITS+3) 04561 & RANGE_MASK]; 04562 outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp21 - tmp11, 04563 CONST_BITS+PASS1_BITS+3) 04564 & RANGE_MASK]; 04565 outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp22 + tmp12, 04566 CONST_BITS+PASS1_BITS+3) 04567 & RANGE_MASK]; 04568 outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp22 - tmp12, 04569 CONST_BITS+PASS1_BITS+3) 04570 & RANGE_MASK]; 04571 04572 wsptr += 6; /* advance pointer to next row */ 04573 } 04574 } 04575 04576 04577 /* 04578 * Perform dequantization and inverse DCT on one block of coefficients, 04579 * producing a 5x10 output block. 04580 * 04581 * 10-point IDCT in pass 1 (columns), 5-point in pass 2 (rows). 
 */

GLOBAL(void)
jpeg_idct_5x10 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
                JCOEFPTR coef_block,
                JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp10, tmp11, tmp12, tmp13, tmp14;
  INT32 tmp20, tmp21, tmp22, tmp23, tmp24;
  INT32 z1, z2, z3, z4, z5;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[5*10]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 10-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/20).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 5; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    z3 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z3 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    z3 += ONE << (CONST_BITS-PASS1_BITS-1);
    z4 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z1 = MULTIPLY(z4, FIX(1.144122806)); /* c4 */
    z2 = MULTIPLY(z4, FIX(0.437016024)); /* c8 */
    tmp10 = z3 + z1;
    tmp11 = z3 - z2;

    tmp22 = RIGHT_SHIFT(z3 - ((z1 - z2) << 1), /* c0 = (c4-c8)*2 */
                        CONST_BITS-PASS1_BITS);

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876)); /* c6 */
    tmp12 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c2-c6 */
    tmp13 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c2+c6 */

    tmp20 = tmp10 + tmp12;
    tmp24 = tmp10 - tmp12;
    tmp21 = tmp11 + tmp13;
    tmp23 = tmp11 - tmp13;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    z4 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);

    tmp11 = z2 + z4;
    tmp13 = z2 - z4;

    tmp12 = MULTIPLY(tmp13, FIX(0.309016994)); /* (c3-c7)/2 */
    z5 = z3 << CONST_BITS;

    z2 = MULTIPLY(tmp11, FIX(0.951056516)); /* (c3+c7)/2 */
    z4 = z5 + tmp12;

    tmp10 = MULTIPLY(z1, FIX(1.396802247)) + z2 + z4; /* c1 */
    tmp14 = MULTIPLY(z1, FIX(0.221231742)) - z2 + z4; /* c9 */

    z2 = MULTIPLY(tmp11, FIX(0.587785252)); /* (c1-c9)/2 */
    z4 = z5 - tmp12 - (tmp13 << (CONST_BITS - 1));

    tmp12 = (z1 - tmp13 - z3) << PASS1_BITS;

    tmp11 = MULTIPLY(z1, FIX(1.260073511)) - z2 - z4; /* c3 */
    tmp13 = MULTIPLY(z1, FIX(0.642039522)) - z2 + z4; /* c7 */

    /* Final output stage */

    wsptr[5*0] = (int) RIGHT_SHIFT(tmp20 + tmp10, CONST_BITS-PASS1_BITS);
    wsptr[5*9] = (int) RIGHT_SHIFT(tmp20 - tmp10, CONST_BITS-PASS1_BITS);
    wsptr[5*1] = (int) RIGHT_SHIFT(tmp21 + tmp11, CONST_BITS-PASS1_BITS);
    wsptr[5*8] = (int) RIGHT_SHIFT(tmp21 - tmp11, CONST_BITS-PASS1_BITS);
    wsptr[5*2] = (int) (tmp22 + tmp12);
    wsptr[5*7] = (int) (tmp22 - tmp12);
    wsptr[5*3] = (int) RIGHT_SHIFT(tmp23 + tmp13, CONST_BITS-PASS1_BITS);
    wsptr[5*6] = (int) RIGHT_SHIFT(tmp23 - tmp13, CONST_BITS-PASS1_BITS);
    wsptr[5*4] = (int) RIGHT_SHIFT(tmp24 + tmp14, CONST_BITS-PASS1_BITS);
    wsptr[5*5] = (int) RIGHT_SHIFT(tmp24 - tmp14, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 10 rows from work array, store into output array.
   * 5-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/10).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 10; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp12 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp12 <<= CONST_BITS;
    tmp13 = (INT32) wsptr[2];
    tmp14 = (INT32) wsptr[4];
    z1 = MULTIPLY(tmp13 + tmp14, FIX(0.790569415)); /* (c2+c4)/2 */
    z2 = MULTIPLY(tmp13 - tmp14, FIX(0.353553391)); /* (c2-c4)/2 */
    z3 = tmp12 + z2;
    tmp10 = z3 + z1;
    tmp11 = z3 - z1;
    tmp12 -= z2 << 2;

    /* Odd part */

    z2 = (INT32) wsptr[1];
    z3 = (INT32) wsptr[3];

    z1 = MULTIPLY(z2 + z3, FIX(0.831253876)); /* c3 */
    tmp13 = z1 + MULTIPLY(z2, FIX(0.513743148)); /* c1-c3 */
    tmp14 = z1 - MULTIPLY(z3, FIX(2.176250899)); /* c1+c3 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[4] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp13,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp11 + tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp11 - tmp14,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 5; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 4x8 output block.
 *
 * 8-point IDCT in pass 1 (columns), 4-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_4x8 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp3;
  INT32 tmp10, tmp11, tmp12, tmp13;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[4*8]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array. */
  /* Note results are scaled up by sqrt(8) compared to a true IDCT; */
  /* furthermore, we scale the results by 2**PASS1_BITS. */

  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 4; ctr > 0; ctr--) {
    /* Due to quantization, we will usually find that many of the input
     * coefficients are zero, especially the AC terms. We can exploit this
     * by short-circuiting the IDCT calculation for any column in which all
     * the AC terms are zero. In that case each output is equal to the
     * DC coefficient (with scale factor as needed).
     * With typical images and quantization tables, half or more of the
     * column DCT calculations can be simplified this way.
     */

    if (inptr[DCTSIZE*1] == 0 && inptr[DCTSIZE*2] == 0 &&
        inptr[DCTSIZE*3] == 0 && inptr[DCTSIZE*4] == 0 &&
        inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*6] == 0 &&
        inptr[DCTSIZE*7] == 0) {
      /* AC terms all zero */
      int dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]) << PASS1_BITS;

      wsptr[4*0] = dcval;
      wsptr[4*1] = dcval;
      wsptr[4*2] = dcval;
      wsptr[4*3] = dcval;
      wsptr[4*4] = dcval;
      wsptr[4*5] = dcval;
      wsptr[4*6] = dcval;
      wsptr[4*7] = dcval;

      inptr++; /* advance pointers to next column */
      quantptr++;
      wsptr++;
      continue;
    }

    /* Even part: reverse the even part of the forward DCT. */
    /* The rotator is sqrt(2)*c(-6). */

    z2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100);
    tmp2 = z1 + MULTIPLY(z2, FIX_0_765366865);
    tmp3 = z1 - MULTIPLY(z3, FIX_1_847759065);

    z2 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    z2 <<= CONST_BITS;
    z3 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    z2 += ONE << (CONST_BITS-PASS1_BITS-1);

    tmp0 = z2 + z3;
    tmp1 = z2 - z3;

    tmp10 = tmp0 + tmp2;
    tmp13 = tmp0 - tmp2;
    tmp11 = tmp1 + tmp3;
    tmp12 = tmp1 - tmp3;

    /* Odd part per figure 8; the matrix is unitary and hence its
     * transpose is its inverse. i0..i3 are y7,y5,y3,y1 respectively.
     */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
    tmp1 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    tmp3 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);

    z2 = tmp0 + tmp2;
    z3 = tmp1 + tmp3;

    z1 = MULTIPLY(z2 + z3, FIX_1_175875602); /* sqrt(2) * c3 */
    z2 = MULTIPLY(z2, - FIX_1_961570560); /* sqrt(2) * (-c3-c5) */
    z3 = MULTIPLY(z3, - FIX_0_390180644); /* sqrt(2) * (c5-c3) */
    z2 += z1;
    z3 += z1;

    z1 = MULTIPLY(tmp0 + tmp3, - FIX_0_899976223); /* sqrt(2) * (c7-c3) */
    tmp0 = MULTIPLY(tmp0, FIX_0_298631336); /* sqrt(2) * (-c1+c3+c5-c7) */
    tmp3 = MULTIPLY(tmp3, FIX_1_501321110); /* sqrt(2) * ( c1+c3-c5-c7) */
    tmp0 += z1 + z2;
    tmp3 += z1 + z3;

    z1 = MULTIPLY(tmp1 + tmp2, - FIX_2_562915447); /* sqrt(2) * (-c1-c3) */
    tmp1 = MULTIPLY(tmp1, FIX_2_053119869); /* sqrt(2) * ( c1+c3-c5+c7) */
    tmp2 = MULTIPLY(tmp2, FIX_3_072711026); /* sqrt(2) * ( c1+c3+c5-c7) */
    tmp1 += z1 + z3;
    tmp2 += z1 + z2;

    /* Final output stage: inputs are tmp10..tmp13, tmp0..tmp3 */

    wsptr[4*0] = (int) RIGHT_SHIFT(tmp10 + tmp3, CONST_BITS-PASS1_BITS);
    wsptr[4*7] = (int) RIGHT_SHIFT(tmp10 - tmp3, CONST_BITS-PASS1_BITS);
    wsptr[4*1] = (int) RIGHT_SHIFT(tmp11 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[4*6] = (int) RIGHT_SHIFT(tmp11 - tmp2, CONST_BITS-PASS1_BITS);
    wsptr[4*2] = (int) RIGHT_SHIFT(tmp12 + tmp1, CONST_BITS-PASS1_BITS);
    wsptr[4*5] = (int) RIGHT_SHIFT(tmp12 - tmp1, CONST_BITS-PASS1_BITS);
    wsptr[4*3] = (int) RIGHT_SHIFT(tmp13 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[4*4] = (int) RIGHT_SHIFT(tmp13 - tmp0, CONST_BITS-PASS1_BITS);

    inptr++; /* advance pointers to next column */
    quantptr++;
    wsptr++;
  }

  /* Pass 2: process 8 rows from work array, store into output array.
   * 4-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/16).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 8; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp2 = (INT32) wsptr[2];

    tmp10 = (tmp0 + tmp2) << CONST_BITS;
    tmp12 = (tmp0 - tmp2) << CONST_BITS;

    /* Odd part */
    /* Same rotation as in the even part of the 8x8 LL&M IDCT */

    z2 = (INT32) wsptr[1];
    z3 = (INT32) wsptr[3];

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100); /* c6 */
    tmp0 = z1 + MULTIPLY(z2, FIX_0_765366865); /* c2-c6 */
    tmp2 = z1 - MULTIPLY(z3, FIX_1_847759065); /* c2+c6 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[3] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp12 + tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp12 - tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 4; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a reduced-size 3x6 output block.
 *
 * 6-point IDCT in pass 1 (columns), 3-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_3x6 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp1, tmp2, tmp10, tmp11, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  int * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  int workspace[3*6]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 6-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/12).
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 3; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp0 <<= CONST_BITS;
    /* Add fudge factor here for final descale. */
    tmp0 += ONE << (CONST_BITS-PASS1_BITS-1);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
    tmp10 = MULTIPLY(tmp2, FIX(0.707106781)); /* c4 */
    tmp1 = tmp0 + tmp10;
    tmp11 = RIGHT_SHIFT(tmp0 - tmp10 - tmp10, CONST_BITS-PASS1_BITS);
    tmp10 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
    tmp0 = MULTIPLY(tmp10, FIX(1.224744871)); /* c2 */
    tmp10 = tmp1 + tmp0;
    tmp12 = tmp1 - tmp0;

    /* Odd part */

    z1 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z2 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
    tmp1 = MULTIPLY(z1 + z3, FIX(0.366025404)); /* c5 */
    tmp0 = tmp1 + ((z1 + z2) << CONST_BITS);
    tmp2 = tmp1 + ((z3 - z2) << CONST_BITS);
    tmp1 = (z1 - z2 - z3) << PASS1_BITS;

    /* Final output stage */

    wsptr[3*0] = (int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS-PASS1_BITS);
    wsptr[3*5] = (int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS-PASS1_BITS);
    wsptr[3*1] = (int) (tmp11 + tmp1);
    wsptr[3*4] = (int) (tmp11 - tmp1);
    wsptr[3*2] = (int) RIGHT_SHIFT(tmp12 + tmp2, CONST_BITS-PASS1_BITS);
    wsptr[3*3] = (int) RIGHT_SHIFT(tmp12 - tmp2, CONST_BITS-PASS1_BITS);
  }

  /* Pass 2: process 6 rows from work array, store into output array.
   * 3-point IDCT kernel, cK represents sqrt(2) * cos(K*pi/6).
   */
  wsptr = workspace;
  for (ctr = 0; ctr < 6; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp0 = (INT32) wsptr[0] + (ONE << (PASS1_BITS+2));
    tmp0 <<= CONST_BITS;
    tmp2 = (INT32) wsptr[2];
    tmp12 = MULTIPLY(tmp2, FIX(0.707106781)); /* c2 */
    tmp10 = tmp0 + tmp12;
    tmp2 = tmp0 - tmp12 - tmp12;

    /* Odd part */

    tmp12 = (INT32) wsptr[1];
    tmp0 = MULTIPLY(tmp12, FIX(1.224744871)); /* c1 */

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[2] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp2,
                                              CONST_BITS+PASS1_BITS+3)
                            & RANGE_MASK];

    wsptr += 3; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 2x4 output block.
 *
 * 4-point IDCT in pass 1 (columns), 2-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_2x4 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp2, tmp10, tmp12;
  INT32 z1, z2, z3;
  JCOEFPTR inptr;
  ISLOW_MULT_TYPE * quantptr;
  INT32 * wsptr;
  JSAMPROW outptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  int ctr;
  INT32 workspace[2*4]; /* buffers data between passes */
  SHIFT_TEMPS

  /* Pass 1: process columns from input, store into work array.
   * 4-point IDCT kernel,
   * cK represents sqrt(2) * cos(K*pi/16) [refers to 8-point IDCT].
   */
  inptr = coef_block;
  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;
  wsptr = workspace;
  for (ctr = 0; ctr < 2; ctr++, inptr++, quantptr++, wsptr++) {
    /* Even part */

    tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
    tmp2 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);

    tmp10 = (tmp0 + tmp2) << CONST_BITS;
    tmp12 = (tmp0 - tmp2) << CONST_BITS;

    /* Odd part */
    /* Same rotation as in the even part of the 8x8 LL&M IDCT */

    z2 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
    z3 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);

    z1 = MULTIPLY(z2 + z3, FIX_0_541196100); /* c6 */
    tmp0 = z1 + MULTIPLY(z2, FIX_0_765366865); /* c2-c6 */
    tmp2 = z1 - MULTIPLY(z3, FIX_1_847759065); /* c2+c6 */

    /* Final output stage */

    wsptr[2*0] = tmp10 + tmp0;
    wsptr[2*3] = tmp10 - tmp0;
    wsptr[2*1] = tmp12 + tmp2;
    wsptr[2*2] = tmp12 - tmp2;
  }

  /* Pass 2: process 4 rows from work array, store into output array. */

  wsptr = workspace;
  for (ctr = 0; ctr < 4; ctr++) {
    outptr = output_buf[ctr] + output_col;

    /* Even part */

    /* Add fudge factor here for final descale. */
    tmp10 = wsptr[0] + (ONE << (CONST_BITS+2));

    /* Odd part */

    tmp0 = wsptr[1];

    /* Final output stage */

    outptr[0] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0, CONST_BITS+3)
                            & RANGE_MASK];
    outptr[1] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0, CONST_BITS+3)
                            & RANGE_MASK];

    wsptr += 2; /* advance pointer to next row */
  }
}


/*
 * Perform dequantization and inverse DCT on one block of coefficients,
 * producing a 1x2 output block.
 *
 * 2-point IDCT in pass 1 (columns), 1-point in pass 2 (rows).
 */

GLOBAL(void)
jpeg_idct_1x2 (j_decompress_ptr cinfo, jpeg_component_info * compptr,
               JCOEFPTR coef_block,
               JSAMPARRAY output_buf, JDIMENSION output_col)
{
  INT32 tmp0, tmp10;
  ISLOW_MULT_TYPE * quantptr;
  JSAMPLE *range_limit = IDCT_range_limit(cinfo);
  SHIFT_TEMPS

  /* Process 1 column from input, store into output array. */

  quantptr = (ISLOW_MULT_TYPE *) compptr->dct_table;

  /* Even part */

  tmp10 = DEQUANTIZE(coef_block[DCTSIZE*0], quantptr[DCTSIZE*0]);
  /* Add fudge factor here for final descale. */
  tmp10 += ONE << 2;

  /* Odd part */

  tmp0 = DEQUANTIZE(coef_block[DCTSIZE*1], quantptr[DCTSIZE*1]);

  /* Final output stage */

  output_buf[0][output_col] = range_limit[(int) RIGHT_SHIFT(tmp10 + tmp0, 3)
                                          & RANGE_MASK];
  output_buf[1][output_col] = range_limit[(int) RIGHT_SHIFT(tmp10 - tmp0, 3)
                                          & RANGE_MASK];
}

#endif /* IDCT_SCALING_SUPPORTED */
#endif /* DCT_ISLOW_SUPPORTED */

Generated on Mon May 28 2012 04:19:13 for ReactOS by
1.7.6.1
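
The routines in this listing share one fixed-point pattern: constants are pre-scaled with FIX(), products are accumulated at that scale, and the result is shifted back down after a rounding "fudge factor" has been added. The following is a minimal standalone sketch of that pattern, not the library's own definitions. All `SKETCH_`/`sketch_` names are hypothetical, and the value 13 for the constant scale is an assumption (the definition of CONST_BITS is not shown in this part of the file). The descale helper uses floor division instead of a signed right shift so that it does not depend on how the compiler shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scale: 13 fractional bits. CONST_BITS is defined elsewhere in the
 * real file; this value is an assumption made for the sketch only. */
#define SKETCH_CONST_BITS 13
#define SKETCH_ONE ((int32_t) 1)

/* Convert a real constant to a scaled integer, rounding to nearest. */
#define SKETCH_FIX(x) \
  ((int32_t) ((x) * (SKETCH_ONE << SKETCH_CONST_BITS) + 0.5))

/* Divide x by 2**n with rounding to nearest: add half of the divisor first
 * (the "fudge factor"), then take the floor of the quotient. */
static int32_t sketch_descale(int32_t x, int n)
{
  int32_t divisor, biased, q;

  assert(n >= 1 && n < 30);
  divisor = SKETCH_ONE << n;
  biased = x + (SKETCH_ONE << (n - 1));
  q = biased / divisor;          /* C division truncates toward zero */
  if (biased % divisor < 0)
    q--;                         /* adjust to floor for negative values */
  return q;
}

/* The three-multiply rotation used in the 4-point odd part above.
 * Inputs are plain integers; outputs are scaled by 2**SKETCH_CONST_BITS. */
static void sketch_rotate(int32_t z2, int32_t z3,
                          int32_t *out0, int32_t *out2)
{
  int32_t z1 = (z2 + z3) * SKETCH_FIX(0.541196100);

  *out0 = z1 + z2 * SKETCH_FIX(0.765366865);
  *out2 = z1 - z3 * SKETCH_FIX(1.847759065);
}
```

A caller would pass two dequantized coefficients to `sketch_rotate` and then apply `sketch_descale(value, SKETCH_CONST_BITS)` to each output to return to integer sample scale. The real routines fold the rounding term into the DC value once per row or column rather than adding it at every descale, which this sketch does not attempt to reproduce.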