...
Background
So, you've integrated Apache Tika into your framework, tried it on a couple of thousand files and all works well. Problem solved!
...
1. Regular catchable exceptions
2. OutOfMemory errors, which can put the JVM in an unreliable state
3. Permanent hangs (Tika can chew up massive amounts of resources and run forever)
4. Security vulnerabilities (e.g. CVE-2016-6809 and CVE-2016-4434)
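For the first two items, a caller can at least tell the cases apart and react differently. The following is a minimal sketch, not part of Tika: the class name ParseOutcomeClassifier, the Outcome enum, and the idea of passing the parse as a Callable are all assumptions made for illustration. It treats an ordinary exception as a per-file failure and an OutOfMemoryError as a signal that the JVM should no longer be trusted.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not a Tika class: classifies what happened when a
// parse task ran inside the current JVM.
public class ParseOutcomeClassifier {

    public enum Outcome { OK, EXCEPTION, OUT_OF_MEMORY }

    // parseTask stands in for whatever call hands a file to the parser.
    public static Outcome classify(Callable<?> parseTask) {
        try {
            parseTask.call();
            return Outcome.OK;
        } catch (OutOfMemoryError e) {
            // The JVM may be in an unreliable state; the caller should
            // plan to restart the process rather than keep parsing.
            return Outcome.OUT_OF_MEMORY;
        } catch (Exception e) {
            // A regular, catchable failure: record it and move on to the next file.
            return Outcome.EXCEPTION;
        }
    }
}
```

This does nothing for hangs or security vulnerabilities; it only separates "skip this file" from "restart the JVM".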
Please note that for item 3, permanent hangs, you cannot terminate the thread. Thread's stop(), suspend() and destroy() methods sound like they'll do the trick, but they won't. You need to kill the entire process. See TIKA-456.
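Since only killing the process reliably ends a hang, one option is to run each parse in a child process and kill that child when it exceeds a time limit. The sketch below uses only the standard java.lang.Process API; the class name ForkedParseRunner, the method runWithTimeout, and the choice of timeout are assumptions for illustration, not something Tika provides under these names.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not a Tika class: runs a command in a child process
// and kills the child if it does not finish within the timeout.
public class ForkedParseRunner {

    // Returns true only if the child exited on its own with status 0.
    public static boolean runWithTimeout(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Discard the child's output so a full pipe buffer cannot block it.
        builder.redirectErrorStream(true);
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        Process child = builder.start();

        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            // Timed out: kill the whole child process, not just a thread.
            child.destroyForcibly();
            child.waitFor();
            return false;
        }
        return child.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ForkedParseRunner <command> [args...]");
            return;
        }
        // A 60-second limit is an arbitrary value chosen for this sketch.
        boolean ok = runWithTimeout(Arrays.asList(args), 60);
        System.out.println(ok ? "finished" : "failed or killed after timeout");
    }
}
```

The command would be whatever launches a parse in its own JVM. Note that destroyForcibly() kills the child itself; any processes the child started would need separate handling.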
As of Tika 1.15, we added a MockParser in the tika-core-tests.jar that allows you to test your framework against items 1-3. Simply add that jar to your classpath, include a <mock> XML file in your set of test documents, and crash, crash away.
...
<throw class="my.evil.DeserializationAttack">bwahahaha</throw>
Usage
Below are several options for adding the dependency.
Including the tika-core-tests dependency in your project
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>tika-core</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
Tika-app
Place the tika-app.jar and the tika-core-tests.jar in a "bin" directory.
...