Current state

Apache Tika includes many Apache and third-party libraries that take different approaches to logging. Tika uses slf4j-api as its logging API and Apache Log4j 2.x as the implementation for modules that require one.

Important note

Tika 2.5.0 (released 2022-10-03) and later depend on slf4j-api 2.0.x, which requires downstream library users to update their logging backend to a compatible version. Tika 2.0.0 – 2.4.1 depend on slf4j-api 1.7.x.

If the backend is not updated, you will see a message like the following:

SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
SLF4J: Class path contains SLF4J bindings targeting slf4j-api versions 1.7.x or earlier.
SLF4J: Ignoring binding found at file:/home/gross/.gradle/caches/modules-2/files-2.1/org.jboss.slf4j/slf4j-jboss-logmanager/1.2.0.Final/baff8ae78011e6859e127a5cb6f16332a056fd93/slf4j-jboss-logmanager-1.2.0.Final.jar!/org/slf4j/impl/StaticLoggerBinder.class
SLF4J: See https://www.slf4j.org/codes.html#ignoredBindings for an explanation.
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console..

Updates for popular logging backends:

  • Apache Log4j 2.x: org.apache.logging.log4j:log4j-slf4j-impl → org.apache.logging.log4j:log4j-slf4j2-impl
  • Logback: ch.qos.logback:logback-classic 1.2.x → 1.3.x (uses older javax.* APIs) or 1.4.x (uses jakarta.* APIs)
  • Apache Log4j 1.2.x: org.slf4j:slf4j-log4j12 → org.slf4j:slf4j-reload4j (though slf4j-log4j12 has had a relocation directive to slf4j-reload4j since 1.7.34), or migrate to Log4j 2.x, since Log4j 1.2.x has been End of Life since 2015 and has known vulnerabilities

The JBoss Logging bindings (slf4j-jboss-logging/slf4j-jboss-logmanager) were, as of 2022-11-09, still on slf4j-api 1.7.x; see https://issues.redhat.com/browse/JBLOGGING-165. If you have to use Tika with JBoss Logging (e.g. with Quarkus or WildFly native logging), you can currently try downgrading org.slf4j:slf4j-api to 1.7.36.
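One way to attempt that downgrade with Maven is to pin the version in your own dependencyManagement section, which takes precedence over versions arriving transitively. The following is a minimal sketch, not a tested configuration. Pinning the two slf4j bridges to the same 1.7.x line is an assumption made here for consistency; the source only mentions downgrading slf4j-api itself.

```xml
<!-- Merge with your dependencyManagement section -->
<dependencyManagement>
  <dependencies>
    <!-- pin slf4j-api to the 1.7.x line so that 1.7.x-only bindings are picked up -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.36</version>
    </dependency>
    <!-- assumption: keep the bridges on the same 1.7.x line as slf4j-api -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>jul-to-slf4j</artifactId>
      <version>1.7.36</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>jcl-over-slf4j</artifactId>
      <version>1.7.36</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

After changing the pins, it is worth checking the resolved dependency tree to confirm that only one slf4j-api version remains on the classpath.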

Tika parser modules

tika-parser-*-module artifacts depend on many Apache and third-party libraries. Tika itself uses slf4j-api, but the underlying libraries use different logging APIs (commons-logging, java.util.logging, log4j 1.2.x, log4j 2.x, slf4j).

By default, Tika brings in slf4j-api via tika-core, plus bridges such as org.slf4j:jcl-over-slf4j and org.slf4j:jul-to-slf4j, as an opinionated default. Depending on your logging backend and preferred configuration, you will need different dependency exclusions and bridges/implementations.

If you have no preference about the logging backend, it is enough to add org.apache.logging.log4j:log4j-core, org.apache.logging.log4j:log4j-slf4j2-impl and org.apache.logging.log4j:log4j-1.2-api (or org.slf4j:log4j-over-slf4j), and to exclude log4j:log4j, commons-logging:commons-logging, ch.qos.logback:logback-classic, ch.qos.logback:logback-core, ch.qos.reload4j:reload4j and org.slf4j:slf4j-reload4j.
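Exclusions declared on one dependency do not stop another dependency from pulling in the same artifact. As an optional safeguard, the build can be made to fail when any of the excluded libraries reappears. The sketch below uses the bannedDependencies rule of maven-enforcer-plugin. The plugin is not mentioned in the original text, and the plugin version and execution id shown are assumptions; adjust them to your build.

```xml
<!-- Merge with your build section -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <version>3.1.0</version> <!-- assumed version, use the one your build standardizes on -->
      <executions>
        <execution>
          <id>ban-conflicting-logging</id> <!-- hypothetical id -->
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <bannedDependencies>
                <!-- same artifacts as the exclusion list above, in groupId:artifactId form -->
                <excludes>
                  <exclude>log4j:log4j</exclude>
                  <exclude>commons-logging:commons-logging</exclude>
                  <exclude>ch.qos.logback:logback-classic</exclude>
                  <exclude>ch.qos.logback:logback-core</exclude>
                  <exclude>ch.qos.reload4j:reload4j</exclude>
                  <exclude>org.slf4j:slf4j-reload4j</exclude>
                </excludes>
              </bannedDependencies>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

The banned list here matches the Log4j 2.x setup; for a Logback backend the two ch.qos.logback entries would have to be removed.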

As of the main branch (and Tika 2.6.0), all Tika source code uses slf4j-api as the logging API, with org.apache.logging.log4j:log4j-core:2.x as the backend for applications such as tika-app, tika-eval-app and tika-server.

The following sections show how to configure dependencies for different logging solutions/backends to avoid conflicts. Logger configuration is out of scope for this document; see the relevant library documentation.

Example configuration for Apache Tika 2.5.0+

If you use Apache Maven, the dependencies section of your pom.xml will contain something like this:

Common sections

<!-- Merge with your properties section -->
<properties>
  <!-- component versions; feel free to keep only those required for your case -->
  <tika.version>2.6.0</tika.version>
  <slf4j.version>2.0.3</slf4j.version>
  <log4j2.version>2.19.0</log4j2.version>
  <logback.version>1.4.4</logback.version> <!-- 1.4.4 for Jakarta EE 9+ or 1.3.4 if you use Java EE or Jakarta EE 8 -->
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-bom</artifactId>
      <version>${tika.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-bom</artifactId>
      <version>${log4j2.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Merge with your dependencies section -->
<dependencies>
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers-standard-package</artifactId>
    <exclusions>
      <!--
        These exclusions will become obsolete at some point, but it is better to keep them for now.
        tika-parser-*-module artifacts should exclude commons-logging explicitly, but upstream libraries
        may add it to their direct or transitive dependencies
      -->
      <exclusion>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
      </exclusion>
      <exclusion>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
      </exclusion>
      <exclusion>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-core</artifactId>
      </exclusion>
      <exclusion>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-reload4j</artifactId>
      </exclusion>
      <exclusion>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
      </exclusion>
      <exclusion>
        <groupId>ch.qos.reload4j</groupId>
        <artifactId>reload4j</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!--
    You may want to add these dependencies to dependencyManagement to force a consistent version.
    tika-parsers declares slf4j-api, jul-to-slf4j and jcl-over-slf4j as explicit dependencies,
    so they are listed here primarily as an example.
  -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>

  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jul-to-slf4j</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>log4j-over-slf4j</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
</dependencies>

Apache Log4j 2.x with slf4j bridges

<!-- Merge with your dependencies section -->
<dependencies>
  <!-- logging backend: log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <!-- version is omitted since there's org.apache.logging.log4j:log4j-bom in dependencyManagement section -->
  </dependency>

  <!-- slf4j implementation that forwards to log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j2-impl</artifactId> <!-- for slf4j 1.7.x use log4j-slf4j-impl instead -->
    <!-- version is omitted since there's org.apache.logging.log4j:log4j-bom in dependencyManagement section -->
  </dependency>
</dependencies>
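The text earlier names org.apache.logging.log4j:log4j-1.2-api as an alternative to org.slf4j:log4j-over-slf4j for handling libraries that still call the Log4j 1.2.x API. A minimal sketch of that variant follows. It assumes the version is managed by the log4j-bom import from the common sections, and that log4j-over-slf4j is removed from your dependencies, since both artifacts provide the org.apache.log4j classes and should not be on the classpath together.

```xml
<!-- Merge with your dependencies section, in place of org.slf4j:log4j-over-slf4j -->
<dependencies>
  <!-- Log4j 1.2.x API bridge that forwards directly to log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-1.2-api</artifactId>
    <!-- version assumed to come from org.apache.logging.log4j:log4j-bom in dependencyManagement -->
  </dependency>
</dependencies>
```

With this variant, Log4j 1.2.x calls reach Log4j 2.x without passing through slf4j first; either route should end up at the same backend.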

Logback

<!-- Merge with your dependencies section -->
<dependencies>
  <!-- TODO: add log4j2 -> slf4j bridge -->

  <!-- slf4j implementation -->
  <dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>${logback.version}</version>
  </dependency>
</dependencies>
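For the Log4j 2.x to slf4j bridge noted as a TODO in the snippet above, one candidate is org.apache.logging.log4j:log4j-to-slf4j, which routes Log4j 2.x API calls to slf4j and therefore on to Logback. This is a sketch only and has not been verified against Tika. It assumes the version is managed by the log4j-bom import from the common sections, and that log4j-slf4j2-impl is not on the classpath, because the two would forward log events to each other.

```xml
<!-- Merge with your dependencies section -->
<dependencies>
  <!-- log4j 2.x API -> slf4j bridge; do not combine with log4j-slf4j2-impl -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-to-slf4j</artifactId>
    <!-- version assumed to come from org.apache.logging.log4j:log4j-bom in dependencyManagement -->
  </dependency>
</dependencies>
```

The Logback exclusions shown in the common sections (ch.qos.logback:logback-classic and ch.qos.logback:logback-core) would also need to be dropped for this backend, otherwise Logback artifacts arriving through Tika stay excluded.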

TO BE REWRITTEN:

Apache Log4j 2.x with slf4j bridges

<dependencies>
  <!-- bridges to route jul and jcl (commons-logging) are already present, so just add the log4j 1.2.x one -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>log4j-over-slf4j</artifactId>
    <version>${slf4j.version}</version>
  </dependency>

  <!-- slf4j implementation to forward logs to log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>${log4j2.version}</version>
  </dependency>

  <!-- logging backend: log4j 2.x -->
  <!-- these dependency declarations are optional since org.apache.logging.log4j:log4j-slf4j-impl depends on them transitively -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
</dependencies>

Apache Log4j 2.x with native bridges

<dependencies>
  <dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>${tika.version}</version>
    <exclusions>
      <!--
        This exclusion will become obsolete at some point, but it is better to keep it for now.
        tika-parsers usually excludes commons-logging explicitly, but upstream libraries
        may add it to their direct or transitive dependencies
      -->
      <exclusion>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
      </exclusion>
      <!-- We exclude slf4j bridges and replace them with native log4j 2.x ones below -->
      <exclusion>
        <groupId>org.slf4j</groupId>
        <artifactId>jul-to-slf4j</artifactId>
      </exclusion>
      <exclusion>
        <groupId>org.slf4j</groupId>
        <artifactId>jcl-over-slf4j</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- slf4j implementation to forward logs to log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>${log4j2.version}</version>
  </dependency>

  <!-- log4j 2.x bridges to forward java.util.logging, jcl/commons-logging and log4j 1.2.x to log4j 2.x -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-jul</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-jcl</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-1.2-api</artifactId>
    <version>${log4j2.version}</version>
  </dependency>

  <!-- logging backend: log4j 2.x -->
  <!-- these dependency declarations are optional since org.apache.logging.log4j:log4j-slf4j-impl depends on them transitively -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
</dependencies>
