Inferno #10
30 апреля 2007 |
|
Formats - details on the decoder jpeg.
Hello! Remember me? :) Since the topic of this article interest to many, I'm not thinking long, he decided to scratch an article. Despite all the apparent complexity, I will try to present all simple, understandable form. I want to warn: all the things I I will write is the result of my own experiments, and therefore, is not the ultimate truth. It's just my thoughts. Thus, if something goes wrong, I'm not guilty:). In article will use excerpts from the documentation zhpegu by Ceryx, as well as optimized and terribly mutilated:)) pieces code from source code pasovskogo zhpeg decoder by Alex Abramov. There are, however, little is left of it, but in any case it i used as the base. This material is not meant to but vichkov - at least, require knowledge of Pascal. Entry said, now go directly to the case. Decoding zhpega can be divided into two stages. 1-I - Scan zhpega, processing and loading of the markers table persons. This stage of initialization, it is carried on setting the decoder to the appropriate zhpeg, all of which I will describe in first. 2-I - Just the process of decoding, it will find further in the text. For a better understanding of algorithms, some places will lead pieces pasovskogo source. A bit of theory. JPEG is a packed frame photorealistic images, that is, it is targeted mainly at compression of color images with 24 bit color depth (up to colors transformation implied by 8 bits per color component the component of RGB). To understand how to decode zhpeg briefly describe the compression frame. Frame is divided into blocks of 8x8. Over each block is DCT (Discrete Cosine Transform), thereby going transformation of the brightness data from time domain into the frequency domain. Then, the frequency matrix kvantiziruetsya, with about proceeds optimization frequencies. Actually, at this stage and the Origin of goes compression due to excessive high-frequency rejection information. Further, all members of the matrix are drawn in a single chain zigzag and coded by RLC (Zero Run Length Coding). Final stage - Huffman coding, which resulted in our complete 8x8 block is just packed a handful of bits. The process of deco coding is performed in reverse order of encoding. Of course, I described a general scheme for the compression process, but I think as long as this enough. We will need some more ideas. Colors in zhpege stores not as RGB, and in the format of YCbCr: Y - component of the brightness, Cb / Cr - color difference components, roughly indicates how many blue and the red in color. Thus, if we not interested in color, you can extract only the Y component. Also on text will appear designation DC / AC. In a communication received by us vector of 64 elements necessary for the subsequent transformation tion of monetary policy, the first element with offset 0 is the DC - it so to speak, the zero frequency, the background brightness, and all after following 63 elements - AC. This is necessary because different coefficients coefficients are encoded differently. 0-I frequency, as a rule, I tsya weak, so the coefficient itself is not encrypted, and the difference in metal waiting for this and the previous DC coefficient. AC has to encode as it is, there is the frequency change is essential for the whole frame. Zhpeg is a file divided into parts - the segments. That's what it represents segment: - Header (4 bytes): $ Ff segment identifier n type of segment (1 byte) sh, sl size segment, including those 2 bytes but not including the $ FF and the type of segment. Not Intel'ovskom, and Motorol'ovskom order: byte first, the younger the last! - The contents of the segment, max. 65,533 bytes. At the beginning of the segment is a marker - a definite mark: the first byte is always FF, the next - the type of segment. The JPEG format uses motorolovskoy format for word, that is the most significant byte is the word first, the second youngest. We present the main markers that we will need: D8 - SOI Start Of Image C0 - SOF0 Start Of frame (baseline) C2 - SOF2 Start Of frame (progressive) C4 - DHT Define Huffman table DB - DQT Define Quantization table DD - DRI Define Restart Interval DA - SOS Start Of Scan D9 - EOI End Of Image Describe a little bit more tokens: D8, D9 = start, end file; C0, C2 = define the basic parameters of the frame (resolution, CEE tnost, table); C4 = Huffman table (needed for decoding the bit flow); DB = quantization table (needed for the process dekvantizatsii); DD = define restart interval (rarely used in decoder); DA = the beginning of scan (a marker of the start neposre dstvenno compressed data itself zhpega). So as not to clutter up your heads, dear readers, I will not delve into the algorithms zhpega packing, I'll do this later. Let me just say that everything that we need to first do - it scan the file from the beginning of the SOI marker to marker SOS, simultaneously initiating the appropriate variables, and tables. March Coeur SOS determines the beginning of packed data zhpega, and all that goes after him, is already in the process of decoding, it consider further. The scanning process begins with reading zhpega marker SOI. If at the beginning of the file it does not, then it's not zhpeg and can be safely pre terminates chtenie.Srazu for marker followed by 2 bytes of the segment length, with the exception of SOI and EOI, they have a segment missing. Here's the basic scan cycle: ... Repeat BlockRead (PictureFile, v, 2); if Lo (v) <> $ FF then begin WriteLn ('Invalid file format'); Close (PictureFile); Halt end; b: = Hi (v); Marker [b]: = True; if (b <> $ D8) and (b <> $ D9) then begin BlockRead (PictureFile, v, 2); FilePtr: = Swap (v) -2; Case b of $ C0, $ C2: ... {Main Image Parameters} $ C4: ... {Read 'compile Huffman Tables} $ DA: ... {Start Of Scan} $ DB: ... {Define Quantization Tables} $ DD: ... {Define Restart Interval} End; while FilePtr <> 0 do begin BlockRead (PictureFile, v, 1); dec (FilePtr) end; if IOResult <> 0 then begin WriteLn ('I / O error!'); Close (PictureFile); Halt end end Until (b = $ D9) or (b = $ DA); ... BlockRead - read from the file specified number of bytes Lo / Hi - selects junior / senior bytes Swap - older than me and the low byte of places All other markers and their segments, respectively, the transmission repent. Scanning the markers is performed until such time until meet SOS. This suggests that all the preparatory opera radios are made and followed by bit data stream itself zhpega. We now consider themselves more processing markers. From We shall assume going in that order: first, a full format the corresponding segment, then the code snippet, and then comment. While I will give a brief description, a more detailed look at all on. So, if suddenly you something that will clear the Council has skip this place and read on. SOF0, SOF2: Getting a shot: ~~~~~~~~~~~~~~~~~~~~~~~~ - $ Ff, $ c0 (SOF0) - Length (high, low byte), 8 + * 3 kol.komponent - Accuracy of data (1 byte) in the format bits / cell, typically 8 - Height zhpega (2 bytes, St-MA) - Zhpega width (2 bytes, St-MA) - Kol.komponent (1 byte): usually 1 = grayscale, 3 = color YCbCr - For each component: 3 bytes - ID of the component (1 = Y, 2 = Cb, 3 = Cr) - Sampling factor (bit 0-3 vert., 4-7 Mountain). - Quantization table number ... $ C0, $ C2: begin vv: = ReadByte; {Main Image Parameters} Height: = ReadWord; Width: = ReadWord; planes: = ReadByte; if (vv <> 8) or ((planes <> 1) and (planes <> 3)) then begin WriteLn ('Only 8-bit Grayscale / RGB images supported'); Close (PictureFile); Halt end; For hh: = 0 to planes-1 do begin CmpID [hh]. C: = ReadByte; vv: = ReadByte; CmpID [hh]. H: = Hi4 (vv); CmpID [hh]. V: = Lo4 (vv); CmpID [hh]. T: = ReadByte end; method: = b end; ... ReadByte / ReadWord - read byte / word from a file Lo4/Hi4 - highlights the Junior / Senior Part bytes At first, followed by: the bit data (typically 8 bits, the remaining values can not handle), height, width of the image at the peak villages, the number of components (specifies the type of image: 1 = black but-and-white, 3 = color). Next to each component implies 3 buy that: the type of components (1 = Y, 2 = Cb, 3 = Cr); sampling factor; room table Litsa quantization. All these parameters must be stored in the corresponding corresponding variables and arrays, you will need it later. DHT: Define Huffman Table (TX): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - $ Ff, $ c4 (DHT) - Length (high, low byte) - Information TX bytes: Bit 0 .. 3: number of Tx (0 .. 3, otherwise error) bit 4: type TX, 0 = DC table, 1 = AC table bit 5 .. 7: not used = 0 - 16 bytes: number of characters with codes of length 1 .. 16, the amount these bytes is the total number of codes that must be <= 256 - N bytes: A table containing the characters in order of increasing code length (n = total number of codes) Comment: - One DHT segment may contain multiple tables each with its own information byte. ... $ C4: begin Repeat {Read 'compile Huffman Tables} hh: = ReadByte; For vv: = 0 to 15 do HT.L [vv]: = ReadByte; aa: = 0; For vv: = 0 to 15 do For m: = 1 to HT.L [vv] do begin HT.V [aa]: = ReadByte; inc (aa); end; c: = 0; aa: = 0; For vv: = 0 to 15 do begin if HT.L [vv]> 0 then begin HT.H2 [Hi4 (hh), Lo4 (hh), vv]. SV: = aa-c; HT.H2 [Hi4 (hh), Lo4 (hh), vv]. EV: = aa + HT.L [vv]; end; For m: = 1 to HT.L [vv] do begin HT.H1 [Hi4 (hh), Lo4 (hh), aa]. V: = HT.V [aa]; if vv <8 then begin HT.H1 [Hi4 (hh), Lo4 (hh), c]. L: = vv +1; HT.H1 [Hi4 (hh), Lo4 (hh), c]. LV: = HT.V [aa]; end; inc (aa); inc (c) end; c: = c shl 1; end; Until FilePtr = 0; end; ... It is more difficult. The fact is that not enough just to get these tables. It is also necessary to convert them into easy-to-pa Bots decoder format. Therefore, we dwell on this detail. First, a little theory. Huffman is a statistical coding, that is, characters with a large number of entries in the file assigned a code with a lower bit depth, with a smaller number - bo Most bit. Thus is formed a character alphabet with disproportionately long code assigned to it. Due to this compression is achieved by groups of characters. Resulting in a output bit stream data. For successful decoding Bito flux must be concordance of characters and their codes of appropriate length. We are interested in codes of lengths 0 to 15, ie 16 bits maximum. Back to our bit of code. In the beginning is the Information LIMITED bytes in it: bits 0 .. 3 - Huffman table number, bit 4 - type Table 0 = DC / 1 = AC. Followed by 16 bytes, which describe the number of characters with a length codes from 1 to 16, the sum of these bytes is the total number of codes and must not exceed 256. Then are symbols in order of increasing code length. Attention! Go characters, but their codes. That is code we still need them to set group it. Single DHT segment may have a few tables, each with its information byte. Total number of such tables can be 8: 4 for 4 for DC and AC. We have a symbol table and their lengths. Now we need to define the Huffman codes for each character. This is done by the following algorithm: first start code c = 0; order go through all the symbols of our table of length 1 16, at each iteration we increase the code value per unit; when the code length of code multiplied by 2, which is equivalent to shift code by one digit to the left. As a result, we have a full table face of all the characters and their corresponding Huffman codes. That with her to do next, it becomes clear later. While we need to just keep all this data. DQT: Define Quantization Table (TC): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - $ Ff, $ db (DQT) - Length (high, low byte) - Information bytes TK: Bit 0 .. 3: number of TC (0 .. 3, otherwise error) Bit 4 .. 7: precision of the TC 0 = 8 bits, or 16 bits - N bytes of TC, n = 64 * (precision +1) Comment: - One DQT segment may contain multiple tables each with its own information byte. - For exactly 1 (16 bit) sequence - a senior, then Jr. (for each of the 64 words.) ... $ DB: begin Repeat {Define Quantization Tables} hh: = ReadByte; For vv: = 0 to $ 3F do if Hi4 (hh) = 0 then qtmp [vv]: = ReadByte; for m: = 0 to 63 do Quant [Lo (hh), m]: = qtmp [z2t [m]]; for v: = 0 to 7 do for w: = 0 to 7 do begin if w = 0 then cw: = frac else cw: = round (frac * cos ((w * PI) / 16) * sqrt (2)); if v = 0 then cv: = frac else cv: = round (frac * cos ((v * PI) / 16) * sqrt (2)); cw: = (cw * cv) shr prec; Quant [Lo (hh), v * 8 + w]: = mul1 (Quant [Lo (hh), v * 8 + w], cw); end; Until FilePtr = 0; end; ... Quantization tables are needed to restore the frequency matrices and have dimensions 8x8, that is all of these coefficients defects in one matrix will be 64. Listing: First read information bytes, there bits 0 .. 3 - quantization table number from 0 to 3, bits 4 .. 7 - bit elements of the matrix (0 = 8 bits otherwise 16-bit). Next reads and scaling elements ntov. One DQT segment may contain multiple tables quantum tion, each with its own information byte. Most zhpegov designed for 8-bit table. DRI: Define restart interval: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - $ Ff, $ dd (DRI) - Length (high, low byte) = 4 - Restart interval (high, low byte) in units MCU blocks - this means that every n MCU blocks may be found marker RSTn. The first delivery has to be a marker RST0, then RST1, and so on, until RST7, then again RST0. ... $ DD: begin RestartInterval: = ReadWord end ... All that needs to be done - it keep it in a variable. I note that they occur very rarely. SOS: start a scan: ~~~~~~~~~~~~~~~~~~~~~~~~~ - $ Ff, $ da (SOS) - Length (high, low byte) 6 +2 * (kol.komponent scan) - The number of components of the scan (1 byte) must be> = 1 and <= 4 (otherwise error), usually 1 or 3 - For each component: 2 bytes - ID of the component (1 = Y, 2 = Cb, 3 = Cr), watch SOF0 - Huffman table is used: - Bits 0 .. 3: AC table (0 .. 3) - Bit 4 .. 7: DC table (0 .. 3) - 3 bytes should be skipped (???) Comment: - Data zhpega immediately follow the SOS segment. $ DA: begin m: = ReadByte; {Start Of Scan} For hh: = 0 to m-1 do begin Scan.Cmp [hh]: = ReadByte; vv: = ReadByte; Scan.TD [hh]: = Hi4 (vv); Scan.TA [hh]: = Lo4 (vv) end; Scan.Ss: = ReadByte; Scan.Se: = ReadByte; vv: = ReadByte; Scan.Ah: = Hi4 (vv); Scan.Al: = Lo4 (vv) end; Followed by the number of scan component - typically 1 or 3. Next to each component of the next 2 bytes: 1 st - iDEN tifikator components (1 = Y, 2 = Cb, 3 = Cr), 2 nd - No. used Huffman table, here bits 0 .. 3 = AC table (0 .. 3), 4 .. 7 = DC table (0 .. 3). Then there are the 3 bytes that need transmission tit. As mentioned earlier, this token is the last him, followed immediately follow the compressed data zhpega. So, all preparations are done, now need to move directly to the process of decoding zhpega.Dlya start explain how the compressed data is located. Information in zhpege storage nitsya blocks of 8x8, that is, 64 bytes for each of the components Y / Cb / Cr. Although this particular case, when the sampling factor is 1:1:1, but about this later. Immediately after the Y component is followed Cb and Cr, in such a Thus, we have only 3 * 64 bytes per block 8x8 image. Units starting from the top left corner of the image and go to the left right and from top to bottom. That is, we gradually went down. In the end of the handle end is bitstrima zhpega EOI, which we not necessarily track, because we already know how much the peak mudflows, and, respectively, and blocks in our zhpege. All byte alignment markers shall fill the remaining bi Comrade units, so if the stream meets bytes FF, it must pass. General list of stages of decoding is as follows: 1) Huffman decoder (decoding of DC / AC coefficients) 2) Dekvantizatsiya vector of 64 elements 3) Zig-Zag sorting and recovery of 8x8 block 4) Application to the block ODKP Repeat the first 4 stages for each block of 8x8, each of whom component of the image Y / Cb / Cr. May) Scaling Cb / Cr 6) Convert the level of 7) Convert YCbCr-> RGB All these stages are described only decode a single block of peak mudflows MCU. For the rest you must repeat these steps, along the way reading data from a file and copying them to the appropriate place on the screen or buffer. We consider in detail each stage. 1. Huffman decoder All data is encoded Huffman, this is achieved by a finite compression zhpega. What is this kind of coding? As I already written, each encoded symbol is associated code Huffman, depending on the frequency of occurrence of a character in the stream. The lower the probability of occurrence of the symbol, the greater the length of the code it is assigned, and vice versa. Thus, a so-called disproportionate coding. Due to this is the optimal zation of redundancy. As a result, we have a bit stream (bit Stream). Since the data are a bit structure, and the current code is not known how long, our disk driver must be able to read data sequentially bit by bit. Furthermore, at each iteration radios must add bits to the existing and verify the corresponding correspondence from the tables Haffmana.Esli code is found, the decoded value is retained, otherwise continues decoding. Rises the question of how you can quickly find the current code in the table? First, we present the procedure for reading the bit stream: procedure NextBit (var V: byte); begin V: = (V shl 1) + Read1bit end; function Read1bit: byte; {Take one bit from Current Byte} begin if Current_bit = 0 then begin ReadByte; if Current_byte = $ FF then begin Repeat ReadByte Until Current_byte <$ FF; if (Current_byte> = $ D0) and (Current_byte <= $ D7) then begin FillChar (DC, SizeOf (DC), 0); ReadByte; end; if Current_byte = 0 then Current_byte: = $ FF; end end; Read1bit: = (Current_byte shr 7) and 1; Current_byte: = Current_byte shl 1; inc (Current_bit); if Current_bit = 8 then Current_bit: = 0 end; As can be seen, the procedure NextBit just adds another bit to the variable V. Read1bit function returns the next bit schitanye from the stream. It also skips the byte FF and initializes all DC coefficients, if the marker is found RST0-RST7 (D0-D7). We now turn to the point, the decoder: function hd (T, C: byte): byte; {Decoding Huffman Code from bitstream} var v, code: byte; begin v: = 0; {L - HuffCode len; L = 0 - no code; L = len +1 (1 .. 8) lookup} NextBit (v); if HT.H1 [T, C, v]. L = 1 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 2 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 3 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 4 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 5 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 6 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 7 then begin hd: = HT.H1 [T, C, v]. LV; exit end; NextBit (v); if HT.H1 [T, C, v]. L = 8 then begin hd: = HT.H1 [T, C, v]. LV; exit end; {SV - Start Value (aa-w); EV - Next Code Len Value} NextBit (v); code: = v + HT.H2 [T, C, 8]. SV; if code <HT.H2 [T, C, 8]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 9]. SV; if code <HT.H2 [T, C, 9]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 10]. SV; if code <HT.H2 [T, C, 10]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 11]. SV; if code <HT.H2 [T, C, 11]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 12]. SV; if code <HT.H2 [T, C, 12]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 13]. SV; if code <HT.H2 [T, C, 13]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 14]. SV; if code <HT.H2 [T, C, 14]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; NextBit (v); code: = v + HT.H2 [T, C, 15]. SV; if code <HT.H2 [T, C, 15]. EV then begin hd: = HT.H1 [T, C, code]. V; exit end; hd: = $ ff; end; I want to give you the results of their experiments in this area. Once again, I warn you that almost everything about what I will speak and say before, is the result of my own houses words, so may differ from the conventional methods and forms lirovok. As it turned out, most of the codes does not exceed 8 bits, thus, you can create a 256-byte conversion table. In this case, decoding is extremely fast: everything what we need - just take from the table ready-made meaning. In If the code is> 8 bit, here a little more complicated. We need to know all initial positions and final positions SV EV for long codes 8 .. 16. That is necessary to create plate values, or rather, three plaques. The first series will contain all of our encoded Misformed characters, let's call it Table V. I'll tell you how to generate the other two. For each code length of 8 .. 16 to set the initial nuyu position SV = offset of the first code in Table V minus itself the first code. For example, we have a code of% 110 = 6, he goes under the of numbers rum 5 in the table, then SV = 5 - 6 = -1. The third table should contain the end position of EV for the current code length. As in the previous case, for all lengths of codes of 8 .. 16 need to ask EV = displacement of the first code in Table V, plus the number of codes this length. According to the previous example, if the current number of codes ing length L = 4, then EV = 5 + 4 = 9. All of this was given early hour above in a piece of code marker processing DHT. Now explain why this is all nuzhno.V accordance with our tables, as shown in the above snippet, the search values to Yes, proceed as follows obrazom.Dlya appropriate length of code: add up the current code and SV, code = v + SV; if code <EV, then extract from Table V decoded symbol, he goes on the bias Niya code, or, equivalently, hd = V [code]. Ed.: As the bits in the stream have to read in any metal Tode Huffman decoding, it is easier (and, oddly enough, faster, if you have a conversation about the processor Z80) decoding codes neposre dstvenno the tree, bit by bit, without using additional tables. Appropriate procedures you can borrow from unpacker smallunr.H (см.в Appendix). In ZXUnRar decoding of the same is one bit, but for this pre-generated parsing Huffman codes based on the current tree, which is why to get even more high-speed decoding. If you think that the process of decoding the coefficients on This ends, you're wrong - it is just beginning:). So far we have only the decoded bytes of which must be obtain the coefficients of DC / AC. In addition to all to increase efficiency efficiency of compression has been added to RLC compression sequence zeros. Let's see how to decode these coefficients. Decoding of DC coefficients is performed by the following algebraic algorithm: At the beginning of DC = 0 a) retrieving the corresponding Huffman code (check in Huffman Table for DC) b) Consider how to categorize this code belongs to c) read N = bits category, and determine the value of Diff = = (B, N bits) d) DC = DC + Diff e) write DC in the 64-element vector: vector [0] = DC Decoding of AC coefficients is performed by the following algorithm: For each AC coefficient is not met and the EOB AC_counter <= 64 a) retrieving the corresponding Huffman code (check Huffman table for AC) b) decode Huffman code in accordance with (kol_pred_0, category) c) read N = bits category, and determine the value of AC = = (B, N bits) d) writing in the 64 element vector sequence of zeros = = Kol_pred_0 e) increase in AC_counter kol_pred_0 e) write the AC in a 64-element vector: vector [AC_counter] = AC The code snippet for reading coefficients DC / AC looks like follows: hb: = HD (0, Scan.TD [b]); vec [0]: = DC [b] + Bits2Integer (Lo4 (hb), ReadBits (Lo4 (hb))); DC [b]: = vec [0]; xx: = 1; if method <> $ C2 then Repeat hb: = HD (a, Scan.TA [b]); if hb = 0 then Repeat vec [xx]: = 0; inc (xx) Until xx> = 64 else begin yy: = Hi4 (hb); for m: = 1 to yy do begin vec [xx]: = 0; inc (xx) end; vec [xx]: = Bits2Integer (Lo4 (hb), ReadBits (Lo4 (hb))); inc (xx) end Until xx> = 64; Explain in more detail. First, we define DC. To do this, decode Diff. It is encoded by two elements (F, Nbit). In start going 4-bit (notebooks) category, representing the length of the read code, which is encoded by Huffman. That is first decode it, and then, knowing the length of the code Diff, read N bits. Next comes the conversion of N bits in the signed word for the following rules: Value Category Bits 0 0 - -1,1 1 0,1 -3, -2,2,3 2 00,01,10,11 -7, -6, -5, -4,4,5,6,7 3 000,001,010,011,100,101,110,111 -15 ..- 8.8 .. 15 April 0000 .. 0111.1000 .. 1111 -31 ..- 16.16 .. May 31 00 000 .. 01111.10000 .. 11111 -63 ..- 32.32 .. 63 6. -127 ..- 64.64 .. 127 7. -255 ..- 128.128 .. 255 8. -511 ..- 256.256 .. 511 9. -1023 ..- 512.512 .. 1023 10. -2047 ..- 1024.1024 .. 2047 11. -4095 ..- 2048.2048 .. 4095 12. -8191 ..- 4096.4096 .. 8191 13. -16383 ..- 8192.8192 .. 16383 14. -32767 ..- .. 16384.16384 32 767 15. The transformation takes the following code: function Bits2Integer (bits: byte; value: word): integer; begin if (value and (1 shl (bits-1))> 0) then Bits2Integer: = value else Bits2Integer: =- (value xor (a shl bits-1)); end; At the end determine the value of DC as the sum of the previous DC and found Diff. The final value is stored in the vector to zero Left bias. Now, how to determine the coefficients of AC. Here is more complicated - there may be several. Furthermore, additional uses coding sequence of zeros (RLC). For each element from 2 to 64 must decode the byte containing tetrads data (kol_pred_0, category), where kol_pred_0 = number leading zeros. Next, from the current position must be filled string vector with zeros in the number of kol_pred_0. Moreover, if B is (0,0), it is a sign the end of the block EOB, in this case the remaining elements of the vector filled with zeros, and at this filling equation of the vector ends. If not, the following Reading groups from Nbit bits and the conversion of AC coefficients cient, as in the previous case. Decoding of DC / AC coefficients must be performed by with corresponding Huffman tables. One more thing. There are two format of the data rate in zhpege. The first is called the baseline (Marker C0) about it and I wrote it all 64 of the coefficient vector go straight. Zhpegi of this type opened in a single pass from the top down. There is another format - progressive (marker C2). In it for one frame is read out only one factor, first DC, continue to consistently all AC. Thus a total scan divided into several consecutive scans. Koliches GUT coefficient depends on the quality of compression zhpega. To open the zhpega of this type requires frame buffer for storing the coefficients coefficients of DC / AC. The advantage of this type is the possibility see the image frame, without waiting for end of file. Reading the following lowing portions of the coefficients will only improve the quality of pictures ki, the frame will be like focus. Due to the complexity of progressive scan, I did not support it fully, making only read the first scan containing DC coefficients. 2. Dekvantizatsiya vector of the 64 items At this stage, you restore the optimized coefficient vector. This is done as follows. On stage of preparation was done reading all we need tab persons quantization. All we need now - just multiply all the elements of our vector to the corresponding table entries quasi ntizatsii. You can combine this stage with the stage ODKP (reverse DCT), as I have done myself. 3. Zig-Zag sorting and recovery of 8x8 block At the stage of compression, the translation unit in the 8x8 vector, the coefficients treated zigzag. This was necessary for better grouping sequence of zeros. Now we need to do the opposite surgery - to restore a block of 8x8 vector. The order following factors in the matrix: 0 1 5 6 14 15 27 28 2 4 7 13 16 26 29 42 3 8 12 17 25 30 41 43 9 11 18 24 31 40 44 53 10 19 23 32 39 45 52 54 20 22 33 38 46 51 55 60 21 34 37 47 50 56 59 61 35 36 48 49 57 58 62 63 That is, elements of the vector must be recorded in the cell matrix matrix, in accordance with their serial numbers. As a result, we we have fully restored the matrix for later, perhaps, most important of all, the stage. 4. Application to block ODKP This is the most interesting part of the decoding. ODKP (Reverse Dis discrete cosine transform) belongs to the family of transformations Fourier transform and converts the data from the frequency area in the interim. That is, at the entrance we have a matrix of frequencies after applying ODKP is the matrix of discrete values, or brightness of pixels. The main difficulty of this stage is that in fact the Fourier transform performed too slow slowly. I shall not here to paint all sorts of classical mathematical cal formulas and calculations on this subject can write an entire book, cite only thing in my opinion, the best solution. There is a family of fast Fourier transform algorithms - IFFT (inverse fast Fourier transform). From the set of data GOVERNMENTAL methods I chose the scheme AA'N, as the fastest. Only minus this method - a small loss of accuracy, although the eye, I did not notice her. A fragment of code, assume that the matrix 1x8: ... {Even part} t0: = tout [i * 8 +0]; t1: = tout [i * 8 +2]; t2: = tout [i * 8 +4]; t3: = tout [i * 8 +6]; t10: = t0 + t2; t11: = t0-t2; t13: = t1 + t3; t12: = (t1-t3) * (2 * c4)-t13; t0: = t10 + t13; t3: = t10-t13; t1: = t11 + t12; t2: = t11-t12; {Odd part} t4: = tout [i * 8 +1]; t5: = tout [i * 8 +3]; t6: = tout [i * 8 +5]; t7: = tout [i * 8 +7]; z13: = t6 + t5; z10: = t6-t5; z11: = t4 + t7; z12: = t4-t7; t7: = z11 + z13; t11: = (z11-z13) * (2 * c4); z5: = (z10 + z12) * (2 * c2); t10: = (2 * (c2-c6)) * z12-z5; t12: = (-2 * (c2 + c6)) * z10 + z5; t6: = t12-t7; t5: = t11-t6; t4: = t10 + t5; tout [i * 8 +0]: = t0 + t7; tout [i * 8 +7]: = t0-t7; tout [i * 8 +1]: = t1 + t6; tout [i * 8 +6]: = t1-t6; tout [i * 8 +2]: = t2 + t5; tout [i * 8 +5]: = t2-t5; tout [i * 8 +4]: = t3 + t4; tout [i * 8 +3]: = t3-t4; ... Here: tout [] - working matrix of 1x8 i - number of the current row Constants: c2 = cos (2 * PI/16); c4 = cos (4 * PI/16); c6 = cos (6 * PI/16); This code should go first for all lines on Shay 8x8 matrix, and then over all columns. Receive 16 iterations tions: 8 + 8 on the line into columns. In processing the columns in front the final result should be recorded to divide it into 8, this requires a specific method. There is another subtlety, without which algorithm will not work. Before processing the initial matrix must be multiplied by a constant. Here's how it will look like: ... for j: = 0 to 7 do for i: = 0 to 7 do begin if i = 0 then ci: = 1 else ci: = cos ((i * PI) / 16) * sqrt (2); if j = 0 then cj: = 1 else cj: = cos ((j * PI) / 16) * sqrt (2); tout [j * 8 + i]: = tin [j * 8 + i] * ci * cj; end; ... As can be seen if the item number is not zero, it is necessary to multiply this element in the cos ((i * PI) / 16) * sqrt (2), otherwise the unit, the same and j. These tweaks are made to reduce the number of multiplication tions in the processing cycle. If the pre-multiplied data constants of quantization tables and combine stages 2 and 4, is included in the IFFT dekvantizatsiyu, you can win a few short STI. This was done earlier in the processing of marker DQT, QS third piece of code. All the above steps enable us to obtain the coefficients a lko one component (Y / Cb / Cr). Therefore, the first four stages must be repeated for each component, of course, if zhpeg Full color. The following is a description of the stages after the decoding tion of all 3 components. 5. Scaling Cb / Cr As a result of the previous stages had received information about 3 component Y / Cb / Cr. That is 3 blocks 8x8, describing the pixels image. In fact, this is a special case when scale (sampling factor) component Y / Cb / Cr = 1:1:1, but it happens not always. Often the scale of the components taken 2:1:1, which means that This means that the brightness of Y 2 cells have an element of color surface Cb / Cr. Same thing happens on the other coordinate, then There are at X, and Y. This data is loaded before processing marker SOS. There is a concept of minimum coded unit - MCU (Minimum Coded Unit), which describes the image block. When sampling 1:1:1 MCU factor is equal to 8x8. With 2:1:1 MCU is 16x16. In the second case, it turns out that the data Y components in the 4 times greater than for Cb / Cr. If you submit a 8x8 block as DU (DataUnit), the latter case can be written as: YDU, YDU, YDU, YDU, CbDU, CrDU. Into 4 blocks of data for luminance Y has to one block chrominance Cb / Cr. This assumption allows us to obtain even greater compression with almost imperceptible quality loss pictures. Having said that, for each component must be also take into account the scale and comply fully load MCU. Data blocks of 64 elements is located in the MCU from left to right, top to bottom. Once all information is uploaded tion of all the components necessary to perform resampling, that is scaled down, if necessary, the components of Cb / Cr. When sampling factor 2:1:1 to obtain 3 of the matrix elements of the 16x16. In 1:1:1 case all components are one to one, and scaling is unnecessary, MCU will be equal to 8x8. In principle, there are also Other variations, for example, Cb / Cr on X can be 2 unit (1:1:1), and the Y 1 (2:1:1). But such cases are extremely rarely, I did not fool them head and supported only two per curves. 6. Level conversion You must convert the value of our component of the sign in unsigned form. Make it very simple - all that is needed is to add 128 to all 8-bit signed values of our com components. At this stage, and adjusts the brightness and color balance. If the tables take into account the brightness and color immediately and significance level, then this transformation will be executed tsya automatically. 7. Converting YCbCr-> RGB This transformation is the final stage of decoding of and implement the transformation from color space YCbCr format screen pixel RGB. This is done by standard nym formulas: R = Y + 1.402 * (Cr-128) G = Y - 0.34414 * (Cb-128) - 0.71414 * (Cr-128) B = Y + 1.772 * (Cb-128) All values YCbCr - 8 bit unsigned. As a result, we have decoded pixel format image true color (8 bits per component). That's all what I wanted you rasskazat.Iznachalno, however, intended to write an article in the format spekovskogo asma, but given the very heavy source, had to abandon this undertaking and sign the example pasovskih fragments. Can would still write about scaling the resulting image of, about his handling of the output filters, but this is a topic of individual Noah article. And without that, I think, for further consideration threw enough. So, gentlemen, coders, learn, think and write high-quality decoders zhpega for our beloved Speccy. Ed.: The application is browser xjpeg by scor ^ 3umf and the source of the viewer (publication source does not mean that the author abandoned the project - before making any changes please contact the author at scorpzx@mail.ru).
Other articles:
Similar articles:
В этот день... 21 November