Check for corrupted JPEG files in Java
I need a fast Java way to check if a JPEG file is valid or if it's a truncated / corrupted image.
I tried to do it in a couple of ways:
using the javax.imageio.ImageIO class
public boolean check(File image) throws IOException {
    try {
        BufferedImage bi = ImageIO.read(image);
        if (bi == null) {
            return false; // no registered reader recognized the file
        }
        bi.flush();
    } catch (IIOException e) {
        return false;
    }
    return true;
}
but it detects only a few of the corrupted files I have tested, and it's very slow (around 1 image / second on my PC).
Apache Commons Imaging library
public boolean check(File image) throws IOException {
    JpegImageParser parser = new JpegImageParser();
    ByteSourceFile bs = new ByteSourceFile(image);
    try {
        BufferedImage bi = parser.getBufferedImage(bs, null);
        bi.flush();
        return true;
    } catch (ImageReadException e) {
        return false;
    }
}
This code detects all the corrupted images I've tested, but the performance is very poor (less than 1 image / second on my PC).
I'm looking for a Java alternative to the UNIX program jpeginfo which is roughly 10 times faster (on my PC around 10 images / second).
Solution 1:[1]
I took a look at the JPEG format, and to my understanding a final EOI (end-of-image) segment of two bytes (FF D9) should be last.
boolean jpegEnded(String path) throws IOException {
    try (RandomAccessFile fh = new RandomAccessFile(path, "r")) {
        long length = fh.length();
        if (length < 10L) { // Or whatever minimum size you consider plausible
            return false;
        }
        fh.seek(length - 2);
        byte[] eoi = new byte[2];
        fh.readFully(eoi);
        return eoi[0] == -1 && eoi[1] == -39; // FF D9 as signed bytes
    }
}
Solution 2:[2]
Probably not the best of answers, but...
The jpeginfo program you mentioned is written in C. That brings back memories of when I wanted to use code written by the Navy (in C++) in a Java application I was developing.
I had two options:
- Link my Java code to the C++ (C in your case) library using JNI (Java Native Interface).
- Translate the C++ library to Java code.
Option 1 proved difficult for me, as I needed to pass an object into the library and get object(s) back from it. That, together with deadline pressure, pushed me to option 2.
So in your case, because I don't know of any other Java libraries that would meet your requirements, I would suggest one of these two options, or possibly building your own parser.
Solution 3:[3]
The only way to tell for certain if a JPEG image is corrupted is to decompress it.
You ask if there is a quick way. You could certainly trade accuracy for speed. The simplest way would be to check whether the stream has an SOI marker at the front and an EOI marker at the end.
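A minimal sketch of that SOI/EOI check, reading only the first and last two bytes of the file. The class and method names are made up for illustration. Note the trade-off: it would wrongly reject a valid JPEG that carries extra bytes after EOI, and it cannot see damage in the middle of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class JpegMarkerCheck {
    /** Returns true if the file starts with SOI (FF D8) and ends with EOI (FF D9). */
    static boolean hasSoiAndEoi(String path) throws IOException {
        try (RandomAccessFile fh = new RandomAccessFile(path, "r")) {
            long length = fh.length();
            if (length < 4) {
                return false; // too short to hold both markers
            }
            byte[] head = new byte[2];
            byte[] tail = new byte[2];
            fh.readFully(head);       // first two bytes
            fh.seek(length - 2);
            fh.readFully(tail);       // last two bytes
            return (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xD8
                && (tail[0] & 0xFF) == 0xFF && (tail[1] & 0xFF) == 0xD9;
        }
    }
}
```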
Next up, you could try parsing the markers to ensure they have valid values.
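A rough sketch of the marker-parsing step, assuming the whole file is already in a byte array. It walks the segments between SOI and SOS and checks that each one starts with FF and has a length that fits inside the data. The class and method names are hypothetical, and as a simplification it rejects any standalone marker (RSTn, TEM, a second SOI, or EOI) that shows up before the first scan. It does not look at the compressed data after SOS, so it can only catch damage in the headers.

```java
class JpegSegmentWalker {
    /** Walks marker segments from SOI up to SOS and checks each segment length. */
    static boolean headersLookValid(byte[] data) {
        if (data.length < 4 || (data[0] & 0xFF) != 0xFF || (data[1] & 0xFF) != 0xD8) {
            return false; // missing SOI
        }
        int pos = 2;
        while (pos + 4 <= data.length) {
            if ((data[pos] & 0xFF) != 0xFF) {
                return false; // every segment must start with FF
            }
            int marker = data[pos + 1] & 0xFF;
            if (marker == 0xFF) {
                pos++; // fill byte in front of a marker, skip it
                continue;
            }
            if (marker == 0x00 || marker == 0x01 || (marker >= 0xD0 && marker <= 0xD9)) {
                return false; // markers without a length field, not expected here
            }
            // Two-byte big-endian length, which counts the length bytes themselves.
            int length = ((data[pos + 2] & 0xFF) << 8) | (data[pos + 3] & 0xFF);
            if (length < 2 || pos + 2 + length > data.length) {
                return false; // impossible length, or segment runs past the end
            }
            if (marker == 0xDA) {
                return true; // reached SOS, compressed data follows
            }
            pos += 2 + length;
        }
        return false; // ran out of bytes before reaching SOS
    }
}
```

Combining this with the SOI/EOI check is still much cheaper than decoding, but only decompression can tell for certain.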
Solution 4:[4]
It's not a native Java approach, but you can always shell out to a program like jpeginfo or ImageMagick's identify. The overhead of spawning a process may be less than the time spent in the Java libraries.
I had to do something similar, and I found that I could use Runtime.exec to call identify -regard-warnings -verbose - with stdin fed from a byte array. On a 2013 MacBook Pro this takes about 200ms per image (I'm checking mp3 artwork, so image sizes are around 300x300px). Not great, but faster than 1 image per second!
(Note: for my images I had to specify -verbose for ImageMagick to pick up some errors.)
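Here is a minimal sketch of that call using ProcessBuilder rather than Runtime.exec. It assumes identify is on the PATH, that Java 9 or later is available (for Redirect.DISCARD), and that a non-zero exit status means the image has a problem; that last point is an assumption of this sketch, so check it against your own broken files. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;

class IdentifyCheck {
    /** The command line from the answer; the trailing "-" makes identify read stdin. */
    static List<String> identifyCommand() {
        return List.of("identify", "-regard-warnings", "-verbose", "-");
    }

    /** Pipes the image bytes to the command and reports whether it exited with status 0. */
    static boolean passes(List<String> command, byte[] image)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        // Discard the verbose output so the child never blocks on a full pipe.
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        Process process = pb.start();
        try (OutputStream stdin = process.getOutputStream()) {
            stdin.write(image);
        } catch (IOException e) {
            // The child may exit before reading everything; fall through to the exit status.
        }
        return process.waitFor() == 0; // assumption: non-zero means a warning or error
    }
}
```

Usage would look like IdentifyCheck.passes(IdentifyCheck.identifyCommand(), bytes), where bytes holds the image data.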
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Shar1er80 |
Solution 3 | |
Solution 4 | Korny |