...

  • Download the test documents and extract them on a local server.
  • Start a local web server that serves the files from that directory.
    • Enable JWT authentication on this server, so that a bearer token is required to access the web resources.
      • The point of this is to prove we can authenticate safely in this process.
  • Start the Apache Tika Grpc Server
    • Configured with a tika-config XML file tailored to our needs.
  • Provide both Java and Go clients that can open a Grpc connection to the Apache Tika Grpc services, stream the list of HTTP links for the documents into the service, and obtain the parsed output.
    • Show various configurations of the number of parallel worker threads in play.
  • The Grpc server will use mutual TLS authentication.

Java Bi-Directional Streaming Example

...

A Java Tika Grpc Server with an HTTP fetcher is started, and a Tika Grpc Client opens a bidirectional stream and processes a batch of files that need parsing.

https://github.com/apache/tika/blob/tika-grpc-3x-features/tika-pipes/tika-grpc/src/test/java/org/apache/tika/pipes/grpc/PipesBiDirectionalStreamingIntegrationTest.java
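
The shape of this interaction, where links are sent in while parsed results come back concurrently, can be sketched in Go with channels. This is not the Tika Grpc API: the channels only stand in for the two directions of the stream, and `parseAll`, `parseResult` and the `parse` callback are hypothetical names. The `workers` parameter illustrates the configurable number of parallel worker threads.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// parseResult pairs a document link with its (stand-in) parsed output.
type parseResult struct {
	URL  string
	Text string
}

// parseAll sends every URL down an input channel, lets `workers` goroutines
// process them in parallel, and collects results as they arrive.
// Results are returned in completion order, not input order.
func parseAll(urls []string, workers int, parse func(string) string) []parseResult {
	if workers < 1 {
		workers = 1 // at least one worker is needed to drain the input
	}
	in := make(chan string)
	out := make(chan parseResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range in {
				out <- parseResult{URL: u, Text: parse(u)}
			}
		}()
	}

	// Request side of the "stream": send links, then signal completion.
	go func() {
		for _, u := range urls {
			in <- u
		}
		close(in)
	}()

	// Close the response side once every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []parseResult
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	urls := []string{
		"http://localhost:8080/a.pdf", // hypothetical document links
		"http://localhost:8080/b.docx",
		"http://localhost:8080/c.html",
	}
	// The parse callback is a placeholder for the real fetch-and-parse call.
	results := parseAll(urls, 2, func(u string) string { return strings.ToUpper(u) })
	for _, r := range results {
		fmt.Println(r.URL, "->", r.Text)
	}
}
```

Varying `workers` is one simple way to compare throughput across different levels of parallelism.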

Go Bi-Directional Streaming Example

TODO

Build Tika Grpc on Docker

...