# crawlerWrapper
## Run
Run it with

```sh
$GOPATH/bin/crawlerWrapper wc
```

and it will count the characters of your requests. Wow.
## Job submission
Submit with

```sh
curl localhost:8123/submit -X POST -i -d '{"requestid": "026bff12-66c9-4b02-868c-cb3bbee1c08f", "Job": {"data": "foobar"}}'
```
Yes, the `requestid` MUST be a valid UUIDv4. Well, you can omit it, but you shouldn't. The `Job` field must be a JSON object; strings, arrays, and numbers will not work. It must be an object, but it can be any object. This basically means that your worker should probably understand JSON.
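
For completeness, a job can also be submitted from Go instead of curl. This is only a sketch, not part of crawlerWrapper itself: the endpoint and payload shape are taken from the curl example above, and the UUIDv4 helper is hand-rolled just to avoid a third-party dependency.

```go
// submit.go — a sketch of submitting a job from Go.
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// newUUIDv4 builds a random (version 4) UUID string.
func newUUIDv4() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		log.Fatal(err)
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set variant to RFC 4122
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

func main() {
	// Field names match the curl example: lowercase "requestid", capital "Job".
	payload, err := json.Marshal(map[string]interface{}{
		"requestid": newUUIDv4(),
		"Job":       map[string]string{"data": "foobar"},
	})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := http.Post("http://localhost:8123/submit", "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```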
If you chose `wc` as the worker, it will count the bytes of the literal string `{"data": "foobar"}` without doing any JSON parsing.
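
A real worker will usually want to parse the job instead. Below is a minimal sketch of such a worker; it assumes, as the `wc` example suggests, that the wrapper pipes the `Job` object to the worker's stdin, and the `data` field name is taken from the example above.

```go
// worker.go — a minimal sketch of a JSON-aware worker.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	// Decode the Job object from stdin (assumed delivery mechanism,
	// inferred from the wc example above).
	var job struct {
		Data string `json:"data"`
	}
	if err := json.NewDecoder(os.Stdin).Decode(&job); err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse job:", err)
		os.Exit(1)
	}
	// Do the actual work here; this sketch just echoes the payload.
	fmt.Println("processing:", job.Data)
}
```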
## Worker output
You might expect the worker's stdout to be handled in some way. It is not: a worker is supposed to take care of the rest of the chain by itself, so the output is not automatically fed to another worker or anywhere else.
However, there is at least a way to be notified when a job completes. This is done by adding two fields to the job submission:
```sh
curl localhost:8123/submit -X POST -i -d '{"requestid": "'$(python3 -c 'import uuid; print(uuid.uuid4());')'", "ResponseRequested": true, "PingbackURL": "http://google.it/"}'
```
Here we added two fields: `ResponseRequested` is a boolean specifying whether we care about knowing that the command has completed. If unspecified, it defaults to false. If it is true, a POST will be made to the URL specified in `PingbackURL`. The format of this POST is specified by `crawler.JobCompletionNotice`. Please note that the POST is made even on worker error.
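
To receive the notification you need an HTTP endpoint listening at the `PingbackURL`. Here is a minimal sketch of such a receiver; the port and path are made up for this example, and since the exact field layout of `crawler.JobCompletionNotice` is not reproduced in this README, the handler just logs the raw body.

```go
// pingback.go — a sketch of a pingback receiver.
package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/pingback", func(w http.ResponseWriter, r *http.Request) {
		// Log the raw JobCompletionNotice payload; parse it instead
		// once you know the struct's fields.
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		log.Printf("job completed: %s", body)
		w.WriteHeader(http.StatusNoContent)
	})
	log.Fatal(http.ListenAndServe(":8124", nil))
}
```

With this running, a submission carrying `"PingbackURL": "http://localhost:8124/pingback"` (a hypothetical local URL) would get its completion notice logged here.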