...

  1. sudo apt update
  2. gpg
    1. sudo apt install gnupg
  3. java
    1. wget -qO - https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public | sudo apt-key add -
    2. sudo apt-get install -y software-properties-common
    3. sudo add-apt-repository --yes https://adoptopenjdk.jfrog.io/adoptopenjdk/deb/
    4. sudo apt-get install adoptopenjdk-8-hotspot
    5. sudo apt-get install adoptopenjdk-11-hotspot
    6. sudo apt-get install adoptopenjdk-14-hotspot
  4. sudo apt-get install fontconfig (https://github.com/AdoptOpenJDK/openjdk-build/issues/693 via Dominik Stadler)
  5. sudo apt install ttf-dejavu (same as above)
  6. sudo apt-get install groovy
  7. sudo apt-get install maven
  8. sudo apt-get install subversion
  9. sudo apt-get install git
  10. sudo apt-get install file
  11. install docker following https://docs.docker.com/engine/install/ubuntu/
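
As a quick sanity check after the steps above, something like the following sketch can list any expected command that is still missing from the PATH. The helper name missing_tools and the list of commands passed to it are assumptions for illustration (for example, fc-list is used as a stand-in for the fontconfig package); adjust the list to match what was actually installed.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH,
# one per line. Prints nothing if every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands expected after the install steps (assumed list; edit as needed).
missing_tools gpg java groovy mvn svn git file fc-list docker
```

An empty result suggests that all of the listed tools are installed; any name printed points at a step that may need to be repeated.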

Datasette

On 12 November 2020, I ran tika-eval's new FileProfile on the corpus. The profile includes file type detection by Tika and by 'file', digests, and file sizes.

We configured the reverse proxy for /datasette:

ProxyPreserveHost On
ProxyPass /datasette http://0.0.0.0:8001
ProxyPassReverse /datasette http://0.0.0.0:8001
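
The ProxyPass directives above depend on Apache's mod_proxy and mod_proxy_http being loaded. A minimal sketch for checking this, assuming the module list comes from apache2ctl -M; the function name missing_proxy_modules is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `apache2ctl -M` output on stdin and print any
# proxy module required by ProxyPass over HTTP that is not listed.
missing_proxy_modules() {
    loaded=$(cat)
    for mod in proxy_module proxy_http_module; do
        printf '%s\n' "$loaded" | grep -qw "$mod" || echo "$mod"
    done
}

# Usage (on the server):
#   apache2ctl -M | missing_proxy_modules
```

If either module is reported, running sudo a2enmod proxy proxy_http and then reloading Apache (sudo systemctl reload apache2) should enable it on Ubuntu.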

The .db file is in /data1/publish. cd to that directory and then run:

docker run --name datasette -d -p 8001:8001 -v `pwd`:/mnt datasetteproject/datasette datasette -p 8001 -h 0.0.0.0 /mnt/file_profiles.db --config sql_time_limit_ms:120000 --config max_returned_rows:100000 --config base_url:/datasette/

HTTPD

Configuration is in /etc/apache2.

Public directories are symlinks in /usr/share/corpora.


Everything below here needs to be updated for Ubuntu

...