With TIKA-605, you can now use Tika to parse geospatial file formats! To figure out how, read on.
First, install GDAL. If you're lucky, this will work:
$ brew install gdal --complete
Errors encountered with brew and Mavericks
If you hit errors at this step after upgrading to Mavericks, first run the following, then retry the install:
$ brew rm $(join <(brew leaves) <(brew deps gdal --complete ))
Note that the above instructions are Mac-centric. We recommend checking GDAL's website for instructions on installing GDAL on your operating system.
Once GDAL is installed, the gdalinfo command should be available on your path.
Running gdalinfo with no arguments should produce something like:
Usage: gdalinfo [--help-general] [-mm] [-stats] [-hist] [-nogcp] [-nomd]
                [-norat] [-noct] [-nofl] [-checksum] [-proj4]
                [-listmdd] [-mdd domain|`all`]*
                [-sd subdataset] datasetname

FAILURE: No datasource specified.
If that works you are in business!
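The path check above can also be scripted. The following is a minimal sketch, not part of Tika or GDAL: the function names `have_cmd` and `check_gdal` are made up for illustration, and the only thing it relies on is the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Return success if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print a short status line for gdalinfo (hypothetical helper).
# Returns non-zero if gdalinfo is missing.
check_gdal() {
    if have_cmd gdalinfo; then
        echo "gdalinfo found at: $(command -v gdalinfo)"
    else
        echo "gdalinfo not found on PATH" >&2
        return 1
    fi
}
```

Calling `check_gdal` before running Tika gives an early, readable failure if GDAL is not installed.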
Using Tika and GDAL
To use Tika and GDAL, grab the latest 1.7-SNAPSHOT build of Tika and a geospatial file. This example uses a Flexible Image Transport System (FITS) file. Then run:
java -jar tika-app-1.7-SNAPSHOT.jar -m WFPC2u5780205r_c0fx.fits
This should produce, e.g.,
ALLG-MAX: 3.777701E3
ALLG-MIN: -7.319537E1
ATODCORR: COMPLETE
..
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.gdal.GDALParser
If you see X-Parsed-By: ..GDALParser and a bunch of geospatial metadata, you are in business!
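The check for the GDALParser line can be automated with a small filter. This is a sketch only: `parsed_by_gdal` is a hypothetical helper name, and it assumes the metadata is printed one `key: value` pair per line, as in the sample output above.

```shell
#!/bin/sh
# Succeed if the metadata read from stdin lists the GDAL parser
# in an X-Parsed-By entry (hypothetical helper for illustration).
parsed_by_gdal() {
    grep -q 'X-Parsed-By: org\.apache\.tika\.parser\.gdal\.GDALParser'
}
```

For example, `java -jar tika-app-1.7-SNAPSHOT.jar -m WFPC2u5780205r_c0fx.fits | parsed_by_gdal && echo "GDAL parser was used"` prints the message only when the parser line is present.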
Using Tika Server and GDAL
Once you have GDAL and a fresh build of Tika 1.7-SNAPSHOT (including Tika server), you can use Tika server with GDAL. For example, to send a FITS file to the server and get back its metadata, run the following commands:
In one window, start Tika server:
java -jar /path/to/tika-server-1.7-SNAPSHOT.jar
In another window, issue a cURL request:
curl -T /path/to/fits/image.fits http://localhost:9998/tika --header "Content-type: application/fits"
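If you send files to the server often, the cURL request above can be wrapped in a function. The sketch below assumes nothing beyond that request: `put_to_tika`, `TIKA_URL`, and `DRY_RUN` are names invented for this example, and the URL and Content-type header are the ones used above.

```shell
#!/bin/sh
# Send a file to Tika server with an HTTP PUT (curl -T).
# TIKA_URL and DRY_RUN are hypothetical variables for this sketch.
put_to_tika() {
    file=$1
    url=${TIKA_URL:-http://localhost:9998/tika}
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show the request instead of sending it.
        echo "curl -T $file $url --header Content-type: application/fits"
        return 0
    fi
    curl -T "$file" "$url" --header "Content-type: application/fits"
}
```

Setting `DRY_RUN=1` prints the request without contacting the server, which is handy for checking the file path before Tika server is running.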