| Building Blocks: These are the basic nouns of scrapeR. A spider is made of a series of steps that are run sequentially on each item in its queue. A pipeline contains a series of generic steps that can be reused at the end of each spider. A runner runs multiple spiders at once. | |
|---|---|
| Spider for crawling URLs | |
| Collection of generic steps to append to a spider | |
| Runner for running multiple spiders at once | |
| Steps for Spiders and Pipelines | |
| Helpers | |
| Add items to a spider queue | |
| Add steps to a pipeline or spider | |
| Run a spider or a runner | |
| Rename a spider | |
| Set the pipeline used by the given spider | |
| Prebuilt Steps: These built-in steps can be added to a spider or pipeline using the add_step function. | |
| Bind rows | |
| Clean names | |
| Write results to AWS S3 | |
| Save output | |
| Read HTML | |
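
The relationship between a spider, its queue, its steps, and a pipeline can be sketched in base R. This is only a conceptual illustration under stated assumptions: `new_spider`, `enqueue`, `append_step`, and `run_spider` are hypothetical names invented for this sketch, not scrapeR's actual functions, and the real package may store and run steps differently.

```r
# Hypothetical stand-ins for illustration; none of these names are scrapeR's API.

# A spider holds a name, a queue of items, and an ordered list of steps.
new_spider <- function(name) {
  list(name = name, queue = list(), steps = list())
}

# Append one item (for example a URL) to the spider's queue.
enqueue <- function(spider, item) {
  spider$queue <- c(spider$queue, list(item))
  spider
}

# Append one step: a function that takes a value and returns a value.
append_step <- function(spider, step) {
  spider$steps <- c(spider$steps, list(step))
  spider
}

# Run every step, in order, on each queued item.
# Pipeline steps are applied after the spider's own steps.
run_spider <- function(spider, pipeline_steps = list()) {
  all_steps <- c(spider$steps, pipeline_steps)
  lapply(spider$queue, function(item) {
    Reduce(function(value, step) step(value), all_steps, item)
  })
}

spider <- new_spider("demo")
spider <- enqueue(spider, "  Hello ")
spider <- enqueue(spider, " World  ")
spider <- append_step(spider, trimws)
spider <- append_step(spider, toupper)

# A shared "pipeline" of generic steps that several spiders could reuse.
shared <- list(function(x) paste0(x, "!"))

results <- run_spider(spider, pipeline_steps = shared)
```

Here each queued string passes through `trimws`, then `toupper`, then the shared pipeline step, so the order in which steps are added determines the order in which they run. A runner, in the same spirit, would call `run_spider` once per spider.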