The GrobidJournalParser uses GROBID (also written Grobid, short for GeneRation Of BIbliographic Data), a machine learning library, to parse PDF documents and extract structured information such as the title, abstract, authors, affiliations, and keywords from journal publications. The parser is integrated into Tika. Follow this guide to get it working on your system.


Installing GROBID

The recommended approach is to run GROBID via Docker.

...
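
Once a GROBID container is running, it can be useful to confirm that the service is reachable before pointing Tika at it. The following is a minimal sketch, not part of Tika or GROBID. It assumes the service is exposed at http://localhost:8070 (GROBID's default port, which may differ depending on your Docker port mapping) and that it answers "true" on its /api/isalive endpoint. The names isalive_url and grobid_is_alive are hypothetical helpers introduced for illustration.

```python
from urllib.request import urlopen

# Assumption: GROBID's default service address; adjust to your Docker port mapping.
DEFAULT_GROBID_URL = "http://localhost:8070"


def isalive_url(base_url=DEFAULT_GROBID_URL):
    """Build the health-check URL for a GROBID server (hypothetical helper)."""
    return base_url.rstrip("/") + "/api/isalive"


def grobid_is_alive(base_url=DEFAULT_GROBID_URL, timeout=5.0, opener=urlopen):
    """Return True if the GROBID service answers its health check.

    `opener` defaults to urllib's urlopen and can be swapped out in tests.
    """
    try:
        with opener(isalive_url(base_url), timeout=timeout) as response:
            body = response.read().decode("utf-8").strip().lower()
            return response.status == 200 and body == "true"
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False
```

Calling grobid_is_alive() with no arguments checks the assumed default address. Pass a different base_url if the container is mapped to another host or port.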