This is the import.io application running as a command-line crawler. Once provided with a relation and a configuration, it
crawls a site according to that configuration and pushes the results into the related application.
For more information on the command-line crawler, see our support page: http://support.import.io/knowledgebase/articles/325728
This charm sets up a machine to run the import.io application as a command-line crawler. Use this charm to crawl your target sites and push the data directly into your target application.
The target application needs to accept JSON documents posted over HTTP. Currently the only supported application is elasticsearch, which just works(tm).
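To illustrate what "JSON documents posted over HTTP" means for a target application, here is a minimal sketch of one such request. It is not the crawler's actual code: the endpoint URL, the index and type names in its path, the document fields, and the helper names `build_doc` and `post_doc` are all assumptions made for the example, and the exact path layout depends on your Elasticsearch version.

```shell
#!/bin/sh
# Hypothetical endpoint: host, index and type names are assumptions,
# not settings defined by this charm.
ES_ENDPOINT="${ES_ENDPOINT:-http://localhost:9200/crawl/page}"

# Build a small JSON document. The field names are illustrative only.
# No escaping is done, so values must not contain double quotes or backslashes.
build_doc() {
    printf '{"url": "%s", "title": "%s"}' "$1" "$2"
}

# POST one JSON document to the endpoint over HTTP.
post_doc() {
    curl -sS -X POST -H 'Content-Type: application/json' --data "$1" "$ES_ENDPOINT"
}

# Usage (requires a reachable Elasticsearch instance):
#   post_doc "$(build_doc 'http://example.com/' 'Example Domain')"
```

Any data store that accepts a request of this shape could, in principle, act as a target.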
For more details of the import.io command-line crawler functionality please read:-
Command-Line Crawler Instructions
For more details of the import.io command-line crawler settings please read:-
Deploy the charm by doing this:
juju deploy importio
Currently you also need elasticsearch running and related to importio:
juju deploy elasticsearch
juju add-relation importio elasticsearch
Currently the only target we stream JSON documents into is elasticsearch; in theory other data stores would work as well.
The configuration does not ship with defaults for most settings. The easiest way to set them is:-
juju set importio --config /path/to/config.yaml
with a yaml file like so:-
If you have any problems with this charm, or ideas for improvements, please contact us at:- firstname.lastname@example.org or http://support.import.io/