...
Listing all jobs
No Format |
---|
GET /job curl -X GET -H 'Content-Type: application/json' -i http://localhost:8081/job |
The response contains a list of all jobs, both running and historical.
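The same call can be made from a script. The following is a minimal Python sketch, not part of the Nutch distribution. It assumes the server returns a JSON array of job objects and that each object carries a "state" field like the one shown in the job creation response. The helper names fetch_jobs and jobs_in_state are hypothetical.

```python
import json
import urllib.request

# Host and port are taken from the curl example; adjust for your deployment.
BASE_URL = "http://localhost:8081"


def fetch_jobs(base_url=BASE_URL):
    """Send GET /job and return the parsed JSON body."""
    request = urllib.request.Request(
        base_url + "/job",
        headers={"Content-Type": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def jobs_in_state(jobs, state):
    """Keep only the jobs whose "state" field equals the given value.

    Assumption: every job is a dict with a "state" key, as in the
    job creation response. Jobs without that key are skipped.
    """
    return [job for job in jobs if job.get("state") == state]
```

With a server running, jobs_in_state(fetch_jobs(), "RUNNING") would return only the jobs that are still executing, provided the assumptions above hold.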
...
Creates a job with the given parameters. Specify either the job type (such as INJECT, GENERATE, FETCH, or PARSE) or jobClassName.
No Format |
---|
POST /job/create curl -X POST -H 'Content-Type: application/json' -i http://localhost:8081/job/create --data '{"crawlId":"crawl01", "type":"INJECT", "confId":"default", "args": {"url_dir":"seedFiles/seed-1641959745623", "crawldb": "crawldb"}}' |
The response object is shown below.
No Format |
---|
{ "id": "crawl01-default-INJECT-1877363907", "type": "INJECT", "confId": "default", "args": { "url_dir": "seedFiles/seed-1641959745623", "crawldb": "crawldb" }, "result": null, "state": "RUNNING", |
...
No Format |
---|
"msg": "OK", "crawlId": "crawl01" } |
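For scripted use, the job creation request can be assembled as in the following Python sketch. It is an illustration rather than an official client: the function names build_create_job_request and create_job are hypothetical, and the payload fields (crawlId, type, confId, args) are the ones shown in the curl example.

```python
import json
import urllib.request


def build_create_job_request(crawl_id, job_type, args, conf_id="default",
                             base_url="http://localhost:8081"):
    """Build (but do not send) a POST /job/create request.

    The field names mirror the curl example. The default base_url is the
    host and port used there and is an assumption about your deployment.
    """
    payload = {
        "crawlId": crawl_id,
        "type": job_type,
        "confId": conf_id,
        "args": args,
    }
    return urllib.request.Request(
        base_url + "/job/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_job(request):
    """Send the request and return the parsed JSON response object."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Equivalent of the curl example: an INJECT job for crawl01.
inject_request = build_create_job_request(
    "crawl01",
    "INJECT",
    {"url_dir": "seedFiles/seed-1641959745623", "crawldb": "crawldb"},
)
```

Calling create_job(inject_request) against a running server should return the response object, from which fields such as "id" and "state" can be read. Keeping the request construction separate from the network call makes the payload easy to inspect before anything is sent.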
Seed List creation
The /seed/create endpoint lets the user create a seed list and returns the temporary path of the created file. Pass this path as the url_dir parameter of the INJECT job.
...