floey
I created a scraper/crawler for the site and here is the sample output in JSON format for my local crag, The Rumbling Bald.

{"Rumbling Bald":{"styles":{"Boulder":"32%","Unknown":"20%","Sport":"2%","Trad":"44%"},"breadcrumbs":["Rumbling Bald","North Carolina","USA","North America"],"areas":[{"routes":"9","ticks":"129","height":"130ft","name":"Cereal Buttress"},{"height":"","name":"Comotose Area","routes":"0","ticks":"0"},{"routes":"0","ticks":"0","name":"Flakeview Area","height":""},{"height":"","name":"Lakeview Area","routes":"0","ticks":"0"},{"height":"69ft","name":"Screamweaver Area","routes":"3","ticks":"6"},{"ticks":"0","routes":"0","height":"","name":"Cereal Wall"}],"type":"crag","url":"http://www.thecrag.com/climbing/united-s...","number of routes":"34"}}

The scraper can cover a larger area, but I limited it to one crag so we would have a small amount of data to start with.
posted by floey 3 years and 27 days ago

brendanheywood
Is this by any chance running from Quebec in Safari 534.34, and is it still running right now?

We've seen a massive spike in traffic hitting URLs like this:

http://www.thecrag.com/climbing/a/b/area...
http://www.thecrag.com/climbing/a/b/area...
http://www.thecrag.com/climbing/a/b/area...

This is quite odd because that URL format, while it works, isn't something you'd ever find by following links. It works almost by accident.

If it is, please let me know so I don't have to keep chasing this traffic, and please ramp it down a *lot*. Ideally, just scrape the page without any assets (i.e. don't load our Google Analytics beacon).
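
For illustration, here is a minimal Python sketch of what a ramped-down, HTML-only fetch could look like. It is not the scraper from this thread. The 5-second interval, the `Throttle` and `fetch_html` names, and the User-Agent string are all assumptions chosen for the example. Because `urllib` only downloads the document it is asked for and does not run JavaScript or follow asset references, an analytics beacon on the page is never loaded.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum number of seconds between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock    # injectable so the logic can be checked without waiting
        self.sleep = sleep
        self.last = None      # time of the previous request, None before the first

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


def fetch_html(url, throttle,
               user_agent="example-crag-scraper (contact: you@example.com)"):
    """Fetch only the HTML of one page, after waiting for the throttle.

    The User-Agent value is a hypothetical placeholder; an identifying
    string makes the traffic easy for site operators to recognise.
    """
    throttle.wait()
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


# Assumed pacing: at most one request every 5 seconds.
throttle = Throttle(min_interval=5.0)
```

Each call to `fetch_html(some_url, throttle)` then blocks until at least five seconds have passed since the previous request.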
posted by brendanheywood 3 years and 26 days ago

floey
That is quite odd. I haven't been running anything since I posted, and my traffic was minimal and confined to that single URL. I don't live in Quebec or have Safari installed either. I do hope you figure out what the problem is, though.
posted by floey 3 years and 26 days ago

jdorw
Great, thanks! All of the data that you want to show will have to go into a single "paragraph" field, formatted the way you want it to appear on the site. For now that can only be plain text and newlines.
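
As a rough sketch of that flattening step, the Python below turns one crag record into a plain-text string with newlines. The record is a trimmed copy of the sample output earlier in the thread. The `format_paragraph` name and the chosen line layout are assumptions for illustration only, not the format the project requires.

```python
import json

# Trimmed copy of the sample output posted above (two of the six areas).
SAMPLE = r'''
{"Rumbling Bald": {
  "styles": {"Boulder": "32%", "Unknown": "20%", "Sport": "2%", "Trad": "44%"},
  "breadcrumbs": ["Rumbling Bald", "North Carolina", "USA", "North America"],
  "areas": [
    {"routes": "9", "ticks": "129", "height": "130ft", "name": "Cereal Buttress"},
    {"height": "", "name": "Comotose Area", "routes": "0", "ticks": "0"}
  ],
  "type": "crag",
  "number of routes": "34"}}
'''


def format_paragraph(name, crag):
    """Flatten one crag record into plain text lines joined by newlines."""
    # First breadcrumb is the crag itself, the rest is its location.
    lines = [name + " (" + ", ".join(crag["breadcrumbs"][1:]) + ")"]
    lines.append("Routes: " + crag["number of routes"])
    styles = ", ".join(
        f"{style} {share}" for style, share in sorted(crag["styles"].items())
    )
    lines.append("Styles: " + styles)
    for area in crag["areas"]:
        if area["routes"] == "0":
            continue  # skip areas with no routes recorded
        height = f", {area['height']}" if area["height"] else ""
        lines.append(f"{area['name']}: {area['routes']} routes{height}")
    return "\n".join(lines)


sample = json.loads(SAMPLE)
paragraphs = {name: format_paragraph(name, crag) for name, crag in sample.items()}
print(paragraphs["Rumbling Bald"])
```

Sorting the styles keeps the output stable between runs, which makes diffs of the generated file easier to review.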

Feel free to make a pull request at any time. It doesn't have to be completely finished, and it might be easier to go over the output file format on GitHub.
posted by jdorw (Staff) 3 years and 25 days ago

floey
Pull request has been made:
https://github.com/duckduckgo/zeroclicki...
posted by floey 3 years and 25 days ago