...

Listing all jobs


No Format

GET /job
curl -X GET -H 'Content-Type: application/json' -i http://localhost:8081/job 


The response contains a list of all jobs, both running and historical.
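The same call can be made from a script. The following is a minimal Python sketch using only the standard library; it assumes the server listens on localhost:8081 as in the curl example, and the helper names build_list_jobs_request and list_jobs are hypothetical, not part of Nutch.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example above.
BASE_URL = "http://localhost:8081"


def build_list_jobs_request(base_url=BASE_URL):
    """Build (but do not send) the GET /job request."""
    return urllib.request.Request(
        base_url + "/job",
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def list_jobs(base_url=BASE_URL):
    """Send GET /job and return the decoded JSON body."""
    with urllib.request.urlopen(build_list_jobs_request(base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, list_jobs() should return the decoded job list; building the request separately from sending it makes the URL and headers easy to inspect without a live server.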

...

Creates a job with the given parameters. Specify either a job type (such as INJECT, GENERATE, FETCH, or PARSE) or a jobClassName.

No Format

POST /job/create
curl -X POST -H 'Content-Type: application/json' -i http://localhost:8081/job/create \
  --data '{"crawlId":"crawl01",
      "type":"INJECT",
      "confId":"default",
      "args": {"url_dir":"seedFiles/seed-1641959745623", "crawldb": "crawldb"}}'

The response object is shown below:

No Format
{
  "id": "crawl01-default-INJECT-1877363907",
  "type": "INJECT",
  "confId": "default",
  "args": {
    "url_dir": "seedFiles/seed-1641959745623",
    "crawldb": "crawldb"
  },
  "result": null,
  "state": "RUNNING",
  "msg": "OK",
  "crawlId": "crawl01"
}

To select the job by class rather than by type, send jobClassName in place of type:

No Format

POST /job/create
   {
      "crawlId":"crawl01",
      "jobClassName":"org.apache.nutch.fetcher.FetcherJob",
      "confId":"default",
      "args":{"someParam":"someValue"}
   }
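As an illustration of the job creation call, here is a minimal Python sketch that builds the POST /job/create request and reads the job id and state from a response object. It assumes the server address used in the curl example; the function names are hypothetical, and the field names (crawlId, type, confId, args, id, state) are taken from the examples on this page.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example.
BASE_URL = "http://localhost:8081"


def build_create_job_request(crawl_id, job_type, args, conf_id="default",
                             base_url=BASE_URL):
    """Build (but do not send) the POST /job/create request."""
    payload = {
        "crawlId": crawl_id,
        "type": job_type,
        "confId": conf_id,
        "args": args,
    }
    return urllib.request.Request(
        base_url + "/job/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_job(request):
    """Send the request and return the decoded response object."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(response):
    """Return the (id, state) pair from a job response object."""
    return response["id"], response["state"]
```

A caller would pass the result of build_create_job_request("crawl01", "INJECT", {...}) to create_job, then use summarize on the returned object to obtain the job id for later requests.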

Seed List creation

The /seed/create endpoint creates a seed list and returns the temporary path of the created file. Pass this path as the url_dir parameter of the INJECT job.
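The hand-off from seed creation to injection can be sketched as a small Python helper that places the returned path into the body of an INJECT job. This is only an illustration: inject_job_payload is a hypothetical name, and the default values for crawlId, confId, and crawldb are copied from the job creation example rather than required by the API.

```python
import json


def inject_job_payload(seed_path, crawl_id="crawl01", conf_id="default",
                       crawldb="crawldb"):
    """Return a /job/create body for an INJECT job that reads seed_path."""
    if not seed_path:
        raise ValueError("seed_path must be the path returned by /seed/create")
    return {
        "crawlId": crawl_id,
        "type": "INJECT",
        "confId": conf_id,
        # url_dir carries the temporary seed file path returned by /seed/create.
        "args": {"url_dir": seed_path, "crawldb": crawldb},
    }


def to_json(payload):
    """Serialize the payload for use as the POST body."""
    return json.dumps(payload)
```

For example, to_json(inject_job_payload("seedFiles/seed-1641959745623")) yields a body of the same shape as the one passed to curl with --data.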

...