Tika now supports FFMPEG extraction. Read on below to see how.

Install FFMPEG

If you're lucky, the following should install FFMPEG (this assumes macOS with Homebrew):

brew install ffmpeg

You can test to see if FFMPEG is installed by typing:

ffmpeg -version

You should see something resembling:

ffmpeg version 2.3.3 Copyright (c) 2000-2014 the FFmpeg developers
built on Jan  8 2015 14:52:39 with Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
configuration: --prefix=/usr/local/Cellar/ffmpeg/2.3.3 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=clang --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid
libavutil      52. 92.100 / 52. 92.100
libavcodec     55. 69.100 / 55. 69.100
libavformat    55. 48.100 / 55. 48.100
libavdevice    55. 13.102 / 55. 13.102
libavfilter     4. 11.100 /  4. 11.100
libavresample   1.  3.  0 /  1.  3.  0
libswscale      2.  6.100 /  2.  6.100
libswresample   0. 19.100 /  0. 19.100
libpostproc    52.  3.100 / 52.  3.100
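The same check can be scripted. The following is a minimal Python sketch, not part of Tika: the function names `parse_ffmpeg_version` and `installed_ffmpeg_version` are made up for illustration, and it assumes the first line of the banner has the form `ffmpeg version <version> ...` as in the output above.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(banner):
    """Return the version token from the first line of `ffmpeg -version` output, or None."""
    first_line = banner.splitlines()[0] if banner else ""
    match = re.match(r"ffmpeg version (\S+)", first_line)
    return match.group(1) if match else None


def installed_ffmpeg_version():
    """Return the installed FFMPEG version, or None if ffmpeg is not on the PATH."""
    if shutil.which("ffmpeg") is None:
        return None
    # check=False: a non-zero exit simply yields a banner that fails to parse.
    result = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version(result.stdout)
```

Calling `installed_ffmpeg_version()` returns `None` when FFMPEG is missing, which makes it easy to fail early before handing a video file to Tika.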

To use FFMPEG in Tika, simply run Tika App and/or the Tika REST server on a video file. Read on below to see how.

Using Tika App

Once FFMPEG is installed, you can use Tika App to parse a video file. For example:

java -classpath tika-app/target/tika-app-1.9-SNAPSHOT.jar org.apache.tika.cli.TikaCLI -m SPOT11_000001\ 15.AVI

This should produce the following output:

Content-Length: 312559634
Content-Type: video/x-msvideo
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.external.CompositeExternalParser
X-Parsed-By: org.apache.tika.parser.external.ExternalParser
encoder: ankarec
resourceName: SPOT11_000001 15.AVI
videoResolution: 720x480
xmpDM:audioChannelType: 1
xmpDM:audioCompressor: pcm_s16le ([1][0][0][0] / 0x0001)
xmpDM:audioSampleRate: 8000
xmpDM:duration: 00:05:35.92
xmpDM:fileDataRate: 7443 kb/s
xmpDM:videoColorSpace: yuvj420p(pc, bt470bg)
xmpDM:videoCompressor: mjpeg (MJPG / 0x47504A4D)
xmpDM:videoFrameRate: 25

Here is a test on another file:

java -classpath tika-app/target/tika-app-1.9-SNAPSHOT.jar org.apache.tika.cli.TikaCLI -m WOW_MR_T.avi

This should produce the following output:

Content-Length: 8932074
Content-Type: video/x-msvideo
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.external.CompositeExternalParser
X-Parsed-By: org.apache.tika.parser.external.ExternalParser
resourceName: WOW_MR_T.avi
xmpDM:audioCompressor: mp3 (U[0][0][0] / 0x0055)
xmpDM:audioSampleRate: 48000
xmpDM:duration: 00:00:33.03
xmpDM:fileDataRate: 2163 kb/s
xmpDM:videoCompressor: mpeg4 (DX50 / 0x30355844)
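If you want to post-process this output, the `key: value` lines are easy to load into a dictionary. The sketch below is illustrative only: `parse_tika_metadata` and `duration_seconds` are hypothetical helper names, and it assumes each line separates key and value with the first colon followed by a space, and that `xmpDM:duration` always has the `HH:MM:SS.ss` form seen above.

```python
def parse_tika_metadata(text):
    """Parse `-m` style output into a dict mapping each key to a list of values.

    Keys such as X-Parsed-By repeat, so every key maps to a list.
    """
    metadata = {}
    for line in text.splitlines():
        # Split on the first ": " so keys like "xmpDM:duration" stay intact.
        key, separator, value = line.partition(": ")
        if not separator:
            continue  # skip blank or malformed lines
        metadata.setdefault(key, []).append(value)
    return metadata


def duration_seconds(value):
    """Convert an HH:MM:SS.ss duration string into seconds as a float."""
    hours, minutes, seconds = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

For example, `duration_seconds(parse_tika_metadata(output)["xmpDM:duration"][0])` gives the clip length in seconds, where `output` is the captured text printed by the command above.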

Using Tika Server

Start Tika server using the following command:

java -jar tika-server/target/tika-server-1.9-SNAPSHOT.jar

Then, issue a cURL command to post a video to Tika Server for FFMPEG to parse:

curl -T WOW_MR_T.avi -H "Content-Disposition: attachment; filename=WOW_MR_T.avi" http://localhost:9998/rmeta

This should return:

[
   {
      "Content-Type":"video/x-msvideo",
      "X-Parsed-By":[
         "org.apache.tika.parser.DefaultParser",
         "org.apache.tika.parser.external.CompositeExternalParser",
         "org.apache.tika.parser.external.ExternalParser"
      ],
      "X-TIKA:parse_time_millis":"219",
      "resourceName":"WOW_MR_T.avi",
      "xmpDM:audioCompressor":"mp3 (U[0][0][0] / 0x0055)",
      "xmpDM:audioSampleRate":"48000",
      "xmpDM:duration":"00:00:33.03",
      "xmpDM:fileDataRate":"2163 kb/s",
      "xmpDM:videoCompressor":"mpeg4 (DX50 / 0x30355844)"
   }
]
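The same request can be sent from a script instead of cURL. This is a rough Python sketch using only the standard library; the helper names `build_rmeta_request` and `fetch_rmeta` are hypothetical, and the default URL assumes Tika Server is running locally on port 9998 as in the cURL example. Like `curl -T`, it uploads the file with an HTTP PUT. Note that it reads the whole file into memory, which is fine for small clips but may not suit very large videos.

```python
import json
import urllib.request


def build_rmeta_request(body, filename, base_url="http://localhost:9998"):
    """Build a PUT request to the /rmeta endpoint, mirroring the cURL call."""
    return urllib.request.Request(
        base_url + "/rmeta",
        data=body,
        method="PUT",
        headers={"Content-Disposition": "attachment; filename=" + filename},
    )


def fetch_rmeta(path, filename, base_url="http://localhost:9998"):
    """Upload the file at `path` and return the decoded JSON list of metadata objects."""
    with open(path, "rb") as handle:
        body = handle.read()  # whole file is held in memory
    request = build_rmeta_request(body, filename, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `fetch_rmeta("WOW_MR_T.avi", "WOW_MR_T.avi")[0]` would give the first metadata object from the response as a Python dictionary.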