|
Inferno #10
30 апреля 2007 |
|
Formats - details on the decoder jpeg.

Hello! Remember me? :) Since the topic of this article
interest to many, I'm not thinking long, he decided to scratch
an article. Despite all the apparent complexity, I will try to
present all simple, understandable form. I want to warn: all
the things I I will write is the result of my own experiments,
and therefore, is not the ultimate truth. It's just
my thoughts. Thus, if something goes wrong, I'm not guilty:). In
article will use excerpts from the documentation zhpegu by
Ceryx, as well as optimized and terribly mutilated:)) pieces
code from source code pasovskogo zhpeg decoder by Alex Abramov.
There are, however, little is left of it, but in any case it i
used as the base. This material is not meant to but
vichkov - at least, require knowledge of Pascal. Entry
said, now go directly to the case.
Decoding zhpega can be divided into two stages.
1-I - Scan zhpega, processing and loading of the markers
table persons. This stage of initialization, it is carried on
setting the decoder to the appropriate zhpeg, all of which I
will describe in first.
2-I - Just the process of decoding, it will find
further in the text. For a better understanding of algorithms,
some places will lead pieces pasovskogo source.
A bit of theory. JPEG is a packed frame
photorealistic images, that is, it is targeted mainly at
compression of color images with 24 bit color depth (up to
colors transformation implied by 8 bits per color component
the component of RGB). To understand how to decode zhpeg briefly
describe the compression frame.
Frame is divided into blocks of 8x8. Over each block is
DCT (Discrete Cosine Transform), thereby going
transformation of the brightness data from time domain into the
frequency domain. Then, the frequency matrix kvantiziruetsya,
with about proceeds optimization frequencies. Actually, at this
stage and the Origin of goes compression due to excessive
high-frequency rejection information. Further, all members of
the matrix are drawn in a single chain zigzag and coded by RLC
(Zero Run Length Coding). Final stage - Huffman coding, which
resulted in our complete 8x8 block is just packed a handful of
bits. The process of deco coding is performed in reverse order
of encoding. Of course, I described a general scheme for the
compression process, but I think as long as this enough.
We will need some more ideas. Colors in zhpege stores
not as RGB, and in the format of YCbCr: Y - component of the
brightness, Cb / Cr - color difference components, roughly
indicates how many blue and the red in color. Thus, if we
not interested in color, you can extract only the Y component.
Also on text will appear designation DC / AC. In a
communication received by us vector of 64 elements necessary
for the subsequent transformation tion of monetary policy, the
first element with offset 0 is the DC - it so to speak, the
zero frequency, the background brightness, and all after
following 63 elements - AC. This is necessary because different
coefficients coefficients are encoded differently. 0-I
frequency, as a rule, I tsya weak, so the coefficient itself is
not encrypted, and the difference in metal waiting for this and
the previous DC coefficient. AC has to encode as it is, there
is the frequency change is essential for the whole frame.
Zhpeg is a file divided into parts - the segments.
That's what it represents segment:
- Header (4 bytes):
$ Ff segment identifier
n type of segment (1 byte)
sh, sl size segment, including those 2 bytes
but not including the $ FF and the type of
segment.
Not Intel'ovskom, and Motorol'ovskom order:
byte first, the younger the last!
- The contents of the segment, max. 65,533 bytes.
At the beginning of the segment is a marker - a definite
mark: the first byte is always FF, the next - the type of
segment. The JPEG format uses motorolovskoy format for word,
that is the most significant byte is the word first, the second
youngest.
We present the main markers that we will need:
D8 - SOI Start Of Image
C0 - SOF0 Start Of frame (baseline)
C2 - SOF2 Start Of frame (progressive)
C4 - DHT Define Huffman table
DB - DQT Define Quantization table
DD - DRI Define Restart Interval
DA - SOS Start Of Scan
D9 - EOI End Of Image
Describe a little bit more tokens:
D8, D9 = start, end file;
C0, C2 = define the basic parameters of the frame
(resolution, CEE tnost, table);
C4 = Huffman table (needed for decoding the bit
flow);
DB = quantization table (needed for the process
dekvantizatsii);
DD = define restart interval (rarely used in
decoder);
DA = the beginning of scan (a marker of the start neposre
dstvenno compressed data itself zhpega).
So as not to clutter up your heads, dear readers, I will not
delve into the algorithms zhpega packing, I'll do this later.
Let me just say that everything that we need to first do - it
scan the file from the beginning of the SOI marker to marker
SOS, simultaneously initiating the appropriate variables, and
tables. March Coeur SOS determines the beginning of packed data
zhpega, and all that goes after him, is already in the process
of decoding, it consider further.
The scanning process begins with reading zhpega marker SOI.
If at the beginning of the file it does not, then it's not
zhpeg and can be safely pre terminates chtenie.Srazu for marker
followed by 2 bytes of the segment length, with the exception
of SOI and EOI, they have a segment missing.
Here's the basic scan cycle:
...
Repeat
BlockRead (PictureFile, v, 2);
if Lo (v) <> $ FF then begin WriteLn ('Invalid file
format');
Close (PictureFile); Halt end;
b: = Hi (v); Marker [b]: = True;
if (b <> $ D8) and (b <> $ D9) then begin
BlockRead (PictureFile, v, 2); FilePtr: = Swap (v) -2;
Case b of
$ C0, $ C2: ... {Main Image Parameters}
$ C4: ... {Read 'compile Huffman Tables}
$ DA: ... {Start Of Scan}
$ DB: ... {Define Quantization Tables}
$ DD: ... {Define Restart Interval}
End;
while FilePtr <> 0 do begin
BlockRead (PictureFile, v, 1);
dec (FilePtr)
end;
if IOResult <> 0 then begin
WriteLn ('I / O error!');
Close (PictureFile);
Halt
end
end
Until (b = $ D9) or (b = $ DA);
...
BlockRead - read from the file specified number of bytes
Lo / Hi - selects junior / senior bytes
Swap - older than me and the low byte of places
All other markers and their segments, respectively, the
transmission repent. Scanning the markers is performed until
such time until meet SOS. This suggests that all the
preparatory opera radios are made and followed by bit data
stream itself zhpega. We now consider themselves more
processing markers. From We shall assume going in that order:
first, a full format the corresponding segment, then the code
snippet, and then comment. While I will give a brief
description, a more detailed look at all on. So, if suddenly
you something that will clear the Council has skip this place
and read on.
SOF0, SOF2: Getting a shot:
~~~~~~~~~~~~~~~~~~~~~~~~
- $ Ff, $ c0 (SOF0)
- Length (high, low byte), 8 + * 3 kol.komponent
- Accuracy of data (1 byte) in the format bits / cell,
typically 8
- Height zhpega (2 bytes, St-MA)
- Zhpega width (2 bytes, St-MA)
- Kol.komponent (1 byte): usually 1 = grayscale, 3 = color
YCbCr
- For each component: 3 bytes
- ID of the component (1 = Y, 2 = Cb, 3 = Cr)
- Sampling factor (bit 0-3 vert., 4-7 Mountain).
- Quantization table number
...
$ C0, $ C2: begin
vv: = ReadByte; {Main Image Parameters}
Height: = ReadWord;
Width: = ReadWord;
planes: = ReadByte;
if (vv <> 8) or ((planes <> 1) and (planes <> 3))
then begin
WriteLn ('Only 8-bit Grayscale / RGB images
supported');
Close (PictureFile);
Halt
end;
For hh: = 0 to planes-1 do begin
CmpID [hh]. C: = ReadByte; vv: = ReadByte;
CmpID [hh]. H: = Hi4 (vv);
CmpID [hh]. V: = Lo4 (vv);
CmpID [hh]. T: = ReadByte
end;
method: = b end;
...
ReadByte / ReadWord - read byte / word from a file
Lo4/Hi4 - highlights the Junior / Senior Part bytes
At first, followed by: the bit data (typically 8 bits, the
remaining values can not handle), height, width of the image
at the peak villages, the number of components (specifies the
type of image: 1 = black but-and-white, 3 = color). Next to
each component implies 3 buy that: the type of components (1 =
Y, 2 = Cb, 3 = Cr); sampling factor; room table Litsa
quantization. All these parameters must be stored in the
corresponding corresponding variables and arrays, you will need
it later.
DHT: Define Huffman Table (TX):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- $ Ff, $ c4 (DHT)
- Length (high, low byte)
- Information TX bytes:
Bit 0 .. 3: number of Tx (0 .. 3, otherwise error)
bit 4: type TX, 0 = DC table, 1 = AC table
bit 5 .. 7: not used = 0
- 16 bytes: number of characters with codes of length 1 ..
16, the amount
these bytes is the total number of codes that must be <=
256
- N bytes: A table containing the characters in order of
increasing
code length (n = total number of codes)
Comment:
- One DHT segment may contain multiple tables
each with its own information byte.
...
$ C4: begin
Repeat {Read 'compile Huffman Tables}
hh: = ReadByte;
For vv: = 0 to 15 do HT.L [vv]: = ReadByte;
aa: = 0;
For vv: = 0 to 15 do
For m: = 1 to HT.L [vv] do begin
HT.V [aa]: = ReadByte; inc (aa);
end;
c: = 0; aa: = 0;
For vv: = 0 to 15 do begin
if HT.L [vv]> 0 then begin
HT.H2 [Hi4 (hh), Lo4 (hh), vv]. SV: = aa-c;
HT.H2 [Hi4 (hh), Lo4 (hh), vv]. EV: = aa + HT.L
[vv];
end;
For m: = 1 to HT.L [vv] do begin
HT.H1 [Hi4 (hh), Lo4 (hh), aa]. V: = HT.V [aa];
if vv <8 then begin
HT.H1 [Hi4 (hh), Lo4 (hh), c]. L: = vv +1;
HT.H1 [Hi4 (hh), Lo4 (hh), c]. LV: = HT.V [aa];
end;
inc (aa); inc (c)
end;
c: = c shl 1;
end;
Until FilePtr = 0;
end;
...
It is more difficult. The fact is that not enough just to get
these tables. It is also necessary to convert them into
easy-to-pa Bots decoder format. Therefore, we dwell on this
detail. First, a little theory. Huffman is a statistical
coding, that is, characters with a large number of entries in
the file assigned a code with a lower bit depth, with a smaller
number - bo Most bit. Thus is formed a character alphabet with
disproportionately long code assigned to it. Due to this
compression is achieved by groups of characters. Resulting in a
output bit stream data. For successful decoding Bito
flux must be concordance of characters and their
codes of appropriate length. We are interested in codes of
lengths 0 to 15, ie 16 bits maximum.
Back to our bit of code. In the beginning is the Information
LIMITED bytes in it: bits 0 .. 3 - Huffman table number, bit 4
- type Table 0 = DC / 1 = AC. Followed by 16 bytes, which
describe the number of characters with a length codes from 1 to
16, the sum of these bytes is the total number of codes and
must not exceed 256. Then are symbols in order of increasing
code length. Attention! Go characters, but their codes. That is
code we still need them to set group it. Single DHT segment may
have a few tables, each with its information byte. Total number
of such tables can be 8: 4 for 4 for DC and AC. We have a
symbol table and their lengths. Now we need to define the
Huffman codes for each character. This is done by the following
algorithm: first start code c = 0; order go through all the
symbols of our table of length 1 16, at each iteration we
increase the code value per unit; when the code length of code
multiplied by 2, which is equivalent to shift code by one digit
to the left. As a result, we have a full table face of all the
characters and their corresponding Huffman codes. That with her
to do next, it becomes clear later. While we need to just keep
all this data.
DQT: Define Quantization Table (TC):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- $ Ff, $ db (DQT)
- Length (high, low byte)
- Information bytes TK:
Bit 0 .. 3: number of TC (0 .. 3, otherwise error)
Bit 4 .. 7: precision of the TC 0 = 8 bits, or 16 bits
- N bytes of TC, n = 64 * (precision +1)
Comment:
- One DQT segment may contain multiple tables
each with its own information byte.
- For exactly 1 (16 bit) sequence - a senior, then
Jr. (for each of the 64 words.)
...
$ DB: begin
Repeat {Define Quantization Tables}
hh: = ReadByte;
For vv: = 0 to $ 3F do
if Hi4 (hh) = 0 then qtmp [vv]: = ReadByte;
for m: = 0 to 63 do Quant [Lo (hh), m]: = qtmp [z2t
[m]];
for v: = 0 to 7 do
for w: = 0 to 7 do begin
if w = 0
then cw: = frac
else cw: = round (frac * cos ((w * PI) / 16) *
sqrt (2));
if v = 0
then cv: = frac
else cv: = round (frac * cos ((v * PI) / 16) *
sqrt (2));
cw: = (cw * cv) shr prec;
Quant [Lo (hh), v * 8 + w]: = mul1 (Quant [Lo
(hh), v * 8 + w], cw);
end;
Until FilePtr = 0;
end;
...
Quantization tables are needed to restore the frequency
matrices and have dimensions 8x8, that is all of these
coefficients defects in one matrix will be 64. Listing: First
read information bytes, there bits 0 .. 3 - quantization table
number from 0 to 3, bits 4 .. 7 - bit elements of the matrix (0
= 8 bits otherwise 16-bit). Next reads and scaling elements
ntov. One DQT segment may contain multiple tables quantum
tion, each with its own information byte. Most zhpegov
designed for 8-bit table.
DRI: Define restart interval:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- $ Ff, $ dd (DRI)
- Length (high, low byte) = 4
- Restart interval (high, low byte) in units
MCU blocks - this means that every n MCU blocks may
be found marker RSTn. The first delivery has to be a
marker RST0,
then RST1, and so on, until RST7, then again RST0.
...
$ DD: begin RestartInterval: = ReadWord end
...
All that needs to be done - it keep it in a variable.
I note that they occur very rarely.
SOS: start a scan:
~~~~~~~~~~~~~~~~~~~~~~~~~
- $ Ff, $ da (SOS)
- Length (high, low byte)
6 +2 * (kol.komponent scan)
- The number of components of the scan (1 byte)
must be> = 1 and <= 4 (otherwise error), usually 1 or 3
- For each component: 2 bytes
- ID of the component (1 = Y, 2 = Cb, 3 = Cr), watch SOF0
- Huffman table is used:
- Bits 0 .. 3: AC table (0 .. 3)
- Bit 4 .. 7: DC table (0 .. 3)
- 3 bytes should be skipped (???)
Comment:
- Data zhpega immediately follow the SOS segment.
$ DA: begin
m: = ReadByte; {Start Of Scan}
For hh: = 0 to m-1 do begin
Scan.Cmp [hh]: = ReadByte;
vv: = ReadByte;
Scan.TD [hh]: = Hi4 (vv); Scan.TA [hh]: = Lo4 (vv)
end;
Scan.Ss: = ReadByte;
Scan.Se: = ReadByte;
vv: = ReadByte;
Scan.Ah: = Hi4 (vv); Scan.Al: = Lo4 (vv)
end;
Followed by the number of scan component - typically 1
or 3. Next to each component of the next 2 bytes: 1 st - iDEN
tifikator components (1 = Y, 2 = Cb, 3 = Cr), 2 nd - No. used
Huffman table, here bits 0 .. 3 = AC table (0 .. 3), 4 .. 7 = DC
table (0 .. 3). Then there are the 3 bytes that need
transmission tit. As mentioned earlier, this token is the last
him, followed immediately follow the compressed data zhpega.
So, all preparations are done, now need to move
directly to the process of decoding zhpega.Dlya start
explain how the compressed data is located. Information in
zhpege storage nitsya blocks of 8x8, that is, 64 bytes for each
of the components Y / Cb / Cr. Although this particular case,
when the sampling factor is 1:1:1, but about this later.
Immediately after the Y component is followed Cb and Cr, in
such a Thus, we have only 3 * 64 bytes per block 8x8 image.
Units starting from the top left corner of the image and go to
the left right and from top to bottom. That is, we gradually
went down. In the end of the handle end is bitstrima zhpega
EOI, which we not necessarily track, because we already know
how much the peak mudflows, and, respectively, and blocks in
our zhpege. All byte alignment markers shall fill the remaining
bi Comrade units, so if the stream meets bytes FF, it
must pass.
General list of stages of decoding is as follows:
1) Huffman decoder (decoding of DC / AC coefficients)
2) Dekvantizatsiya vector of 64 elements
3) Zig-Zag sorting and recovery of 8x8 block
4) Application to the block ODKP
Repeat the first 4 stages for each block of 8x8, each of whom
component of the image Y / Cb / Cr.
May) Scaling Cb / Cr
6) Convert the level of
7) Convert YCbCr-> RGB
All these stages are described only decode a single block of
peak mudflows MCU. For the rest you must repeat these steps,
along the way reading data from a file and copying them to the
appropriate place on the screen or buffer. We consider in
detail each stage.
1. Huffman decoder
All data is encoded Huffman, this is achieved by a finite
compression zhpega. What is this kind of coding? As I
already written, each encoded symbol is associated code
Huffman, depending on the frequency of occurrence of a
character in the stream. The lower the probability of
occurrence of the symbol, the greater the length of the code it
is assigned, and vice versa. Thus, a so-called disproportionate
coding. Due to this is the optimal zation of redundancy. As a
result, we have a bit stream (bit Stream). Since the data are a
bit structure, and the current code is not known how long, our
disk driver must be able to read data sequentially bit by bit.
Furthermore, at each iteration radios must add bits to the
existing and verify the corresponding correspondence from the
tables Haffmana.Esli code is found, the decoded value is
retained, otherwise continues decoding. Rises the question of
how you can quickly find the current code in the table?
First, we present the procedure for reading the bit stream:
procedure NextBit (var V: byte); begin V: = (V shl 1) +
Read1bit end;
function Read1bit: byte; {Take one bit from Current Byte}
begin
if Current_bit = 0 then begin ReadByte;
if Current_byte = $ FF then begin
Repeat ReadByte Until Current_byte <$ FF;
if (Current_byte> = $ D0) and (Current_byte <= $ D7)
then begin
FillChar (DC, SizeOf (DC), 0); ReadByte;
end;
if Current_byte = 0 then Current_byte: = $ FF;
end
end;
Read1bit: = (Current_byte shr 7) and 1;
Current_byte: = Current_byte shl 1;
inc (Current_bit); if Current_bit = 8 then Current_bit: = 0
end;
As can be seen, the procedure NextBit just adds another bit
to the variable V. Read1bit function returns the next bit
schitanye from the stream. It also skips the byte FF and
initializes all DC coefficients, if the marker is found
RST0-RST7 (D0-D7). We now turn to the point, the decoder:
function hd (T, C: byte): byte;
{Decoding Huffman Code from bitstream}
var v, code: byte;
begin
v: = 0;
{L - HuffCode len; L = 0 - no code; L = len +1 (1 .. 8) lookup}
NextBit (v);
if HT.H1 [T, C, v]. L = 1 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 2 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 3 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 4 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 5 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 6 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 7 then begin hd: = HT.H1 [T, C, v].
LV; exit end;
NextBit (v);
if HT.H1 [T, C, v]. L = 8 then begin hd: = HT.H1 [T, C, v].
LV; exit end; {SV - Start Value (aa-w); EV - Next Code Len
Value}
NextBit (v); code: = v + HT.H2 [T, C, 8]. SV;
if code <HT.H2 [T, C, 8]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 9]. SV;
if code <HT.H2 [T, C, 9]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 10]. SV;
if code <HT.H2 [T, C, 10]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 11]. SV;
if code <HT.H2 [T, C, 11]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 12]. SV;
if code <HT.H2 [T, C, 12]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 13]. SV;
if code <HT.H2 [T, C, 13]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 14]. SV;
if code <HT.H2 [T, C, 14]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
NextBit (v); code: = v + HT.H2 [T, C, 15]. SV;
if code <HT.H2 [T, C, 15]. EV
then begin hd: = HT.H1 [T, C, code]. V; exit end;
hd: = $ ff;
end;
I want to give you the results of their experiments in this
area. Once again, I warn you that almost everything about what
I will speak and say before, is the result of my own houses
words, so may differ from the conventional methods and forms
lirovok. As it turned out, most of the codes does not exceed 8
bits, thus, you can create a 256-byte conversion table.
In this case, decoding is extremely fast: everything
what we need - just take from the table ready-made meaning. In
If the code is> 8 bit, here a little more complicated. We need
to know all initial positions and final positions SV EV for
long codes 8 .. 16. That is necessary to create plate values,
or rather, three plaques.
The first series will contain all of our encoded
Misformed characters, let's call it Table V. I'll tell you how
to generate the other two. For each code length of 8 .. 16 to
set the initial nuyu position SV = offset of the first code in
Table V minus itself the first code. For example, we have a
code of% 110 = 6, he goes under the of numbers rum 5 in the
table, then SV = 5 - 6 = -1. The third table should contain the
end position of EV for the current code length. As in the
previous case, for all lengths of codes of 8 .. 16 need to ask
EV = displacement of the first code in Table V, plus the number
of codes this length. According to the previous example, if the
current number of codes ing length L = 4, then EV = 5 + 4 = 9.
All of this was given early hour above in a piece of code
marker processing DHT.
Now explain why this is all nuzhno.V accordance with our
tables, as shown in the above snippet, the search values to
Yes, proceed as follows obrazom.Dlya appropriate length of code:
add up the current code and SV, code = v + SV; if code <EV, then
extract from Table V decoded symbol, he goes on the bias
Niya code, or, equivalently, hd = V [code].
Ed.: As the bits in the stream have to read in any metal
Tode Huffman decoding, it is easier (and, oddly enough, faster,
if you have a conversation about the processor Z80) decoding
codes neposre dstvenno the tree, bit by bit, without using
additional tables. Appropriate procedures you can borrow from
unpacker smallunr.H (см.в Appendix). In ZXUnRar decoding
of the same is one bit, but for this pre-generated
parsing Huffman codes based on the current tree, which is why
to get even more high-speed decoding.
If you think that the process of decoding the coefficients on
This ends, you're wrong - it is just beginning:).
So far we have only the decoded bytes of which must be
obtain the coefficients of DC / AC. In addition to all to
increase efficiency efficiency of compression has been added to
RLC compression sequence zeros. Let's see how to decode these
coefficients.
Decoding of DC coefficients is performed by the following
algebraic algorithm:
At the beginning of DC = 0
a) retrieving the corresponding Huffman code (check in
Huffman Table for DC)
b) Consider how to categorize this code belongs to
c) read N = bits category, and determine the value of Diff =
= (B, N bits)
d) DC = DC + Diff
e) write DC in the 64-element vector: vector [0] = DC
Decoding of AC coefficients is performed by the following
algorithm:
For each AC coefficient is not met and the EOB
AC_counter <= 64
a) retrieving the corresponding Huffman code (check
Huffman table for AC)
b) decode Huffman code in accordance with (kol_pred_0,
category)
c) read N = bits category, and determine the value of AC =
= (B, N bits)
d) writing in the 64 element vector sequence of zeros =
= Kol_pred_0
e) increase in AC_counter kol_pred_0
e) write the AC in a 64-element vector: vector [AC_counter] =
AC
The code snippet for reading coefficients DC / AC looks like
follows:
hb: = HD (0, Scan.TD [b]);
vec [0]: = DC [b] + Bits2Integer (Lo4 (hb), ReadBits (Lo4
(hb)));
DC [b]: = vec [0]; xx: = 1;
if method <> $ C2 then Repeat
hb: = HD (a, Scan.TA [b]);
if hb = 0 then Repeat vec [xx]: = 0; inc (xx) Until xx> =
64
else begin yy: = Hi4 (hb);
for m: = 1 to yy do begin vec [xx]: = 0; inc (xx) end;
vec [xx]: = Bits2Integer (Lo4 (hb), ReadBits (Lo4
(hb))); inc (xx)
end
Until xx> = 64;
Explain in more detail. First, we define DC. To do this,
decode Diff. It is encoded by two elements (F, Nbit). In
start going 4-bit (notebooks) category, representing
the length of the read code, which is encoded by Huffman. That
is first decode it, and then, knowing the length of the code
Diff, read N bits. Next comes the conversion of N bits in the
signed word for the following rules:
Value Category Bits
0 0 -
-1,1 1 0,1
-3, -2,2,3 2 00,01,10,11
-7, -6, -5, -4,4,5,6,7 3 000,001,010,011,100,101,110,111
-15 ..- 8.8 .. 15 April 0000 .. 0111.1000 .. 1111
-31 ..- 16.16 .. May 31 00 000 .. 01111.10000 .. 11111
-63 ..- 32.32 .. 63 6.
-127 ..- 64.64 .. 127 7.
-255 ..- 128.128 .. 255 8.
-511 ..- 256.256 .. 511 9.
-1023 ..- 512.512 .. 1023 10.
-2047 ..- 1024.1024 .. 2047 11.
-4095 ..- 2048.2048 .. 4095 12.
-8191 ..- 4096.4096 .. 8191 13.
-16383 ..- 8192.8192 .. 16383 14.
-32767 ..- .. 16384.16384 32 767 15.
The transformation takes the following code:
function Bits2Integer (bits: byte; value: word): integer;
begin
if (value and (1 shl (bits-1))> 0) then
Bits2Integer: = value
else
Bits2Integer: =- (value xor (a shl bits-1));
end;
At the end determine the value of DC as the sum of the
previous DC and found Diff. The final value is stored in the
vector to zero Left bias.
Now, how to determine the coefficients of AC. Here is more
complicated - there may be several. Furthermore, additional uses
coding sequence of zeros (RLC). For each element
from 2 to 64 must decode the byte containing tetrads
data (kol_pred_0, category), where kol_pred_0 = number
leading zeros. Next, from the current position must be filled
string vector with zeros in the number of kol_pred_0. Moreover,
if B is (0,0), it is a sign the end of the block EOB, in this
case the remaining elements of the vector filled with zeros,
and at this filling equation of the vector ends. If not, the
following Reading groups from Nbit bits and the conversion of
AC coefficients cient, as in the previous case.
Decoding of DC / AC coefficients must be performed by with
corresponding Huffman tables. One more thing. There are two
format of the data rate in zhpege. The first is called the
baseline (Marker C0) about it and I wrote it all 64 of the
coefficient vector go straight. Zhpegi of this type opened in a
single pass from the top down. There is another format -
progressive (marker C2). In it for one frame is read out only
one factor, first DC, continue to consistently all AC. Thus a
total scan divided into several consecutive scans. Koliches
GUT coefficient depends on the quality of compression zhpega.
To open the zhpega of this type requires frame buffer for
storing the coefficients coefficients of DC / AC. The advantage
of this type is the possibility see the image frame, without
waiting for end of file. Reading the following lowing portions
of the coefficients will only improve the quality of pictures
ki, the frame will be like focus. Due to the complexity of
progressive scan, I did not support it fully, making only read
the first scan containing DC coefficients.
2. Dekvantizatsiya vector of the 64 items
At this stage, you restore the optimized
coefficient vector. This is done as follows. On
stage of preparation was done reading all we need tab
persons quantization. All we need now - just multiply all the
elements of our vector to the corresponding table entries quasi
ntizatsii. You can combine this stage with the stage ODKP
(reverse DCT), as I have done myself.
3. Zig-Zag sorting and recovery of 8x8 block
At the stage of compression, the translation unit in the 8x8
vector, the coefficients treated zigzag. This was necessary for
better grouping sequence of zeros. Now we need to do the
opposite surgery - to restore a block of 8x8 vector. The order
following factors in the matrix:
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
That is, elements of the vector must be recorded in the cell
matrix matrix, in accordance with their serial numbers. As a
result, we we have fully restored the matrix for later, perhaps,
most important of all, the stage.
4. Application to block ODKP
This is the most interesting part of the decoding. ODKP
(Reverse Dis discrete cosine transform) belongs to the family
of transformations Fourier transform and converts the data from
the frequency area in the interim. That is, at the entrance we
have a matrix of frequencies after applying ODKP is the matrix
of discrete values, or brightness of pixels. The main
difficulty of this stage is that in fact the Fourier transform
performed too slow slowly. I shall not here to paint all sorts
of classical mathematical cal formulas and calculations on this
subject can write an entire book, cite only thing in my
opinion, the best solution. There is a family of fast Fourier
transform algorithms - IFFT (inverse fast Fourier transform).
From the set of data GOVERNMENTAL methods I chose the scheme
AA'N, as the fastest. Only minus this method - a small loss of
accuracy, although the eye, I did not notice her.
A fragment of code, assume that the matrix 1x8:
...
{Even part}
t0: = tout [i * 8 +0]; t1: = tout [i * 8 +2];
t2: = tout [i * 8 +4]; t3: = tout [i * 8 +6];
t10: = t0 + t2; t11: = t0-t2; t13: = t1 + t3;
t12: = (t1-t3) * (2 * c4)-t13;
t0: = t10 + t13; t3: = t10-t13;
t1: = t11 + t12; t2: = t11-t12;
{Odd part}
t4: = tout [i * 8 +1]; t5: = tout [i * 8 +3];
t6: = tout [i * 8 +5]; t7: = tout [i * 8 +7];
z13: = t6 + t5; z10: = t6-t5;
z11: = t4 + t7; z12: = t4-t7;
t7: = z11 + z13;
t11: = (z11-z13) * (2 * c4);
z5: = (z10 + z12) * (2 * c2);
t10: = (2 * (c2-c6)) * z12-z5;
t12: = (-2 * (c2 + c6)) * z10 + z5;
t6: = t12-t7; t5: = t11-t6; t4: = t10 + t5;
tout [i * 8 +0]: = t0 + t7; tout [i * 8 +7]: = t0-t7;
tout [i * 8 +1]: = t1 + t6; tout [i * 8 +6]: = t1-t6;
tout [i * 8 +2]: = t2 + t5; tout [i * 8 +5]: = t2-t5;
tout [i * 8 +4]: = t3 + t4; tout [i * 8 +3]: = t3-t4;
...
Here:
tout [] - working matrix of 1x8
i - number of the current row
Constants:
c2 = cos (2 * PI/16);
c4 = cos (4 * PI/16);
c6 = cos (6 * PI/16);
This code should go first for all lines on
Shay 8x8 matrix, and then over all columns. Receive 16
iterations tions: 8 + 8 on the line into columns. In processing
the columns in front the final result should be recorded to
divide it into 8, this requires a specific method. There is
another subtlety, without which algorithm will not work. Before
processing the initial matrix must be multiplied by a constant.
Here's how it will look like:
...
for j: = 0 to 7 do
for i: = 0 to 7 do begin
if i = 0 then ci: = 1 else ci: = cos ((i * PI) / 16) *
sqrt (2);
if j = 0 then cj: = 1 else cj: = cos ((j * PI) / 16) *
sqrt (2);
tout [j * 8 + i]: = tin [j * 8 + i] * ci * cj;
end;
...
As can be seen if the item number is not zero, it is
necessary to multiply this element in the cos ((i * PI) / 16) *
sqrt (2), otherwise the unit, the same and j. These tweaks are
made to reduce the number of multiplication tions in the
processing cycle. If the pre-multiplied data constants of
quantization tables and combine stages 2 and 4, is included in
the IFFT dekvantizatsiyu, you can win a few short STI. This was
done earlier in the processing of marker DQT, QS third piece of
code.
All the above steps enable us to obtain the coefficients a
lko one component (Y / Cb / Cr). Therefore, the first four
stages must be repeated for each component, of course, if zhpeg
Full color. The following is a description of the stages after
the decoding tion of all 3 components.
5. Scaling Cb / Cr
As a result of the previous stages had received information
about 3 component Y / Cb / Cr. That is 3 blocks 8x8, describing
the pixels image. In fact, this is a special case when
scale (sampling factor) component Y / Cb / Cr = 1:1:1, but it
happens not always. Often the scale of the components taken
2:1:1, which means that This means that the brightness of Y 2
cells have an element of color surface Cb / Cr. Same thing
happens on the other coordinate, then There are at X, and Y.
This data is loaded before processing marker SOS. There is a
concept of minimum coded unit - MCU (Minimum Coded Unit), which
describes the image block. When sampling 1:1:1 MCU factor is
equal to 8x8. With 2:1:1 MCU is 16x16. In the second case, it
turns out that the data Y components in the 4 times greater
than for Cb / Cr. If you submit a 8x8 block as DU (DataUnit),
the latter case can be written as: YDU, YDU, YDU, YDU, CbDU,
CrDU. Into 4 blocks of data for luminance Y has to one block
chrominance Cb / Cr. This assumption allows us to obtain even
greater compression with almost imperceptible quality loss
pictures. Having said that, for each component must be also
take into account the scale and comply fully load MCU. Data
blocks of 64 elements is located in the MCU from left to right,
top to bottom. Once all information is uploaded tion of all the
components necessary to perform resampling, that is scaled
down, if necessary, the components of Cb / Cr. When sampling
factor 2:1:1 to obtain 3 of the matrix elements of the 16x16.
In 1:1:1 case all components are one to one, and scaling
is unnecessary, MCU will be equal to 8x8. In principle, there
are also Other variations, for example, Cb / Cr on X can be 2
unit (1:1:1), and the Y 1 (2:1:1). But such cases are extremely
rarely, I did not fool them head and supported only two per
curves.
6. Level conversion
You must convert the value of our component of the sign
in unsigned form. Make it very simple - all that is needed is
to add 128 to all 8-bit signed values of our com components.
At this stage, and adjusts the brightness and color balance. If
the tables take into account the brightness and color
immediately and significance level, then this transformation
will be executed tsya automatically.
7. Converting YCbCr-> RGB
This transformation is the final stage of decoding
of and implement the transformation from color space
YCbCr format screen pixel RGB. This is done by standard
nym formulas:
R = Y + 1.402 * (Cr-128)
G = Y - 0.34414 * (Cb-128) - 0.71414 * (Cr-128)
B = Y + 1.772 * (Cb-128)
All values YCbCr - 8 bit unsigned. As a result, we have
decoded pixel format image true color (8
bits per component).
That's all what I wanted you rasskazat.Iznachalno,
however, intended to write an article in the format spekovskogo
asma, but given the very heavy source, had to abandon
this undertaking and sign the example pasovskih fragments. Can
would still write about scaling the resulting image
of, about his handling of the output filters, but this is a
topic of individual Noah article. And without that, I think,
for further consideration threw enough. So, gentlemen, coders,
learn, think and write high-quality decoders zhpega for our
beloved Speccy.
Ed.: The application is browser xjpeg by scor ^ 3umf and
the source of the viewer (publication source does not mean
that the author abandoned the project - before making any
changes please contact the author at scorpzx@mail.ru).
Other articles:
Similar articles:
В этот день... 25 October