Last month we wrote a REXX program to extract the dimensions of a GIF image, i.e. the number of pixels in the vertical and horizontal directions. This month we'll do the same thing for a JPEG image. More accurately, we will write code that gets the information from a JFIF (JPEG File Interchange Format) file. JPEG is not really a file type, but a compression method. JFIF has become pretty much the standard file format for it, so the terms JPEG and JFIF are often used interchangeably. I will use "JPEG" in the following discussion.
JPEG files are a little more complicated to deal with than GIF files. JPEG files consist of segments, each of which starts with a two-byte marker that identifies the segment type. Some segments consist only of the marker, while others have data following the marker. This article will not be a tutorial on the JPEG format; I will explain only what is necessary to accomplish the task at hand. The two-byte marker always contains hexadecimal FF as the first byte. The second byte indicates the marker type.
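To make the marker layout concrete, here is a minimal sketch that splits a two-byte marker into its FF prefix and its type byte. It works on a sample string built with X2C() rather than on a real file, and the variable names (Marker, Prefix, Type) are just ones I picked for illustration.

```rexx
/* Sketch: split a two-byte JPEG marker into its prefix and type byte */
Marker = X2C("FFD8")                  /* sample bytes: a Start of Image marker */
Prefix = C2X(Substr(Marker, 1, 1))    /* first byte, hex FF for every marker   */
Type   = C2X(Substr(Marker, 2, 1))    /* second byte identifies the segment    */
If Prefix == "FF" Then
  Say "Marker type is hex" Type
Else
  Say "Not a marker"
```

Run as is, this should report a marker type of hex D8.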
JPEG files start with a Start of Image, or SOI, segment. The SOI segment consists only of the marker, which is hex FF D8. We will use this to test whether a file is a JPEG image. We'll read in the first two bytes of the file using CharIn() and then use the C2X() function to convert the bytes to hex. If the result is "FFD8", then we have a JPEG image.
    /* Read width and height of a JPEG image */
    Parse Arg FileName
    FileType=c2x(Charin(FileName,1,2))   /* Read first two bytes and convert to hex */
    If FileType="FFD8" then do
    ...
Following the two-byte marker of most segment types is a two-byte length. The length is stored high byte first and counts its own two bytes plus the data that follows, but not the marker itself. By reading this length, we can jump ahead to the next segment if the current one is of no interest to us. After the SOI segment comes the Application Type 0, or APP0, segment, which in a JFIF file holds the JFIF header. The second byte of its marker is hex E0. Since this segment doesn't contain anything of interest to us, we will just skip over it. After the APP0 segment, things get a little more unpredictable. Segments can come in different orders, so we will scan through them until we hit the one we are looking for: a Start of Frame, whose marker type is either hex C0 (SOF0, baseline) or hex C2 (SOF2, progressive).
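The jump from one segment to the next is simple arithmetic, and a small sketch may help. It uses a few sample bytes held in a string instead of a file: an SOI marker, then an APP0 marker with a length of hex 0010 and a payload that I have cut short, since only the length matters here. The names Data, SegPos, Len and NextPos are my own choices.

```rexx
/* Sketch: work out where the next segment starts, from bytes in memory */
/* SOI marker, APP0 marker, length 0010, then a truncated payload "JFIF" + 00 */
Data    = X2C("FFD8" || "FFE0" || "0010" || "4A46494600")
SegPos  = 3                                  /* APP0 marker starts at byte 3 (1-based) */
Len     = C2D(Substr(Data, SegPos + 2, 2))   /* two-byte length, high byte first       */
NextPos = SegPos + 2 + Len                   /* skip the marker, then length + data    */
Say "Segment length:" Len
Say "Next segment starts at byte" NextPos
```

With these sample bytes the length works out to 16, so the next marker would sit at byte 21.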
I mentioned earlier that some segments consist only of the marker, while others have data after the marker. The ones that have no data are of types hex 01 and hex D0 through hex D9. All other segments have the two-byte length after the marker, so we'll read that in and jump forward the correct number of bytes to get to the next segment.
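Here is one way the marker-only test might be sketched on its own. It runs over a handful of sample type bytes that I chose for illustration, and it converts each one with X2D() so the range check is numeric rather than a text comparison. That is a design choice of this sketch, not a requirement.

```rexx
/* Sketch: classify marker types as marker-only or followed by a length */
Types = "D8 E0 DB C0 01 D9"          /* sample marker type bytes, in hex */
Standalone = ""
Do i = 1 To Words(Types)
  T = Word(Types, i)
  N = X2D(T)                         /* compare as a number, not as text */
  If N = 1 | (N >= 208 & N <= 217) Then Do   /* hex 01, or hex D0 through D9 */
    Say T "is a marker on its own"
    Standalone = Space(Standalone T)
  End
  Else
    Say T "is followed by a two-byte length"
End
```

Of the six sample types, D8, 01 and D9 should come out as marker-only, and E0, DB and C0 as carrying a length.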
Let's put the segment handling code in a subroutine of its own called ReadSegment. We'll pass the current segment position as an input and return the position of the next segment. If the current segment is a Start of Frame segment, then we'll read in the image dimensions. In a Start of Frame segment such as SOF0, the marker is followed by the segment length as usual. The next byte contains the sample precision (almost always 8). The height (number of rows of pixels) is given in the next two bytes. The two bytes following the height are the width (number of columns of pixels). These are, of course, what we want to find out about the image.
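The field layout just described can be tried out on a hand-built segment before touching a real file. The sketch below assumes a made-up frame header for a 640 by 480 image with 8-bit samples, and it stops after the width, leaving out the component data that a real segment would carry. Seg, Precision, Height and Width are illustrative names.

```rexx
/* Sketch: pick the fields out of a start-of-frame segment held in memory */
/* marker FFC0, length 0011, precision 08, height 01E0, width 0280 */
Seg = X2C("FFC0" || "0011" || "08" || "01E0" || "0280")
Precision = C2D(Substr(Seg, 5, 1))   /* byte 5: bits per sample        */
Height    = C2D(Substr(Seg, 6, 2))   /* bytes 6-7: rows of pixels      */
Width     = C2D(Substr(Seg, 8, 2))   /* bytes 8-9: columns of pixels   */
Say "Precision:" Precision
Say "Height:" Height
Say "Width:" Width
```

Hex 01E0 is 480 and hex 0280 is 640, so the sketch should print a precision of 8, a height of 480 and a width of 640.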
Here is the code that reads a segment, checks to see if it is a Start of Frame segment, and reads the height and width of the image if it is:
    ReadSegment:   /* Read a JPEG segment's header */
      Arg SegPos
      Marker=C2X(CharIn(FileName, SegPos))
      If Marker<>"FF" Then Return -1
      Type=C2X(CharIn(FileName))
      Res=SegPos+2   /* position of next segment */
      Select
        When Type="01" | Type>="D0" & Type<="D9" then   /* no length to these */
          len=0
        otherwise
          len=c2d(charin(filename, , 2))   /* read the length of the segment */
      end
      res=res+len
      if type="C0" | type="C2" then do   /* this is what we are looking for */
        /* start of frame 0 */
        /* use c2d() to convert bytes into decimal form for human consumption */
        imagebps=c2d(charin(filename))          /* bits per sample */
        imageheight=c2d(charin(filename, , 2))  /* height of image */
        imagewidth=c2d(charin(filename, , 2))   /* width of image */
      end
      return res   /* return position of next segment */
The main routine simply loops, calling ReadSegment until a Start of Frame segment is found, the End of Image marker (hex D9) is reached, or ReadSegment reports an error:
    /* Read width and height of a JPEG image */
    Parse Arg FileName
    FileType=c2x(Charin(FileName,1,2))
    If FileType="FFD8" then do
      NxtSeg=3
      ImageHeight="IMAGEHEIGHT"   /* sentinel: stays this way until a start of frame is read */
      Do While Type<>"D9" & NxtSeg<>-1 & Imageheight="IMAGEHEIGHT"
        NxtSeg=ReadSegment(NxtSeg)
      End
      rc=Stream(FileName,"C","Close")   /* close the file that CharIn opened */
      Say "Height:" ImageHeight
      Say "Width:" ImageWidth
      Say "Bits Per Sample:" ImageBPS
    end /* Do */
    Else Do
      Say "This doesn't appear to be a JPEG file."
      Exit
    End
    Exit   /* don't run on into any subroutine that follows */
This program probably seems a little more difficult than the kinds of things we have done before. Well, it is. But it shows just how powerful REXX is. With the C2D and C2X functions, you can easily switch back and forth between different data types. With a little documentation on file types, you could write a general purpose program that would return information on many different types of image files. And if you were really ambitious, you could even write image manipulation routines, although you would probably find REXX a little slow for that. (A better approach would be to wrap some REXX code around compute-intensive routines written in a compiled language.)
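As a first step toward such a general purpose program, here is a rough sketch that guesses the image type from the leading bytes. It assumes only two signatures, the letters "GIF" at the start of a GIF file and hex FF D8 at the start of a JPEG, and it runs on sample headers held in a stem rather than on files. In a real program Header would come from something like CharIn(FileName, 1, 6). The stems Samples. and Kind. are names I made up.

```rexx
/* Sketch: guess an image type from the first bytes of a file */
Samples.1 = "GIF89a"                   /* a GIF signature                */
Samples.2 = X2C("FFD8FFE00010")        /* JPEG: SOI followed by APP0     */
Samples.3 = "hello!"                   /* neither of the above           */
Samples.0 = 3
Do i = 1 To Samples.0
  Header = Samples.i
  Select
    When Left(Header, 3) == "GIF" Then Kind.i = "GIF"
    When C2X(Left(Header, 2)) == "FFD8" Then Kind.i = "JPEG"
    Otherwise Kind.i = "unknown"
  End
  Say "Sample" i "looks like:" Kind.i
End
```

Once the type is known, the program could hand the file to the matching routine, such as last month's GIF code or this month's ReadSegment loop.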
In any case, I hope you found this exercise useful for learning a little more about that jewel of a language that comes with OS/2. For those who want to see the above utility in full, here is the complete listing of this month's sample program.
Dr. Dirk Terrell is an astronomer at the University of Florida specializing in interacting binary stars. His hobbies include cave diving, martial arts, painting and writing OS/2 software such as HTML Wizard.
Copyright © 1997 - Falcon Networking | ISSN 1203-5696