I have a digital subscription to a textbook, but it’s super annoying to have to use the website to access the book. I’d like to scrape the ebook and dump the contents into a PDF. I’ve downloaded proprietary PDFs from websites before using downloader browser plugins and predictable URLs, but this site is pretty locked down, with randomly generated URL tokens and a combination of XML and image data.

Has anyone managed to scrape a digital textbook like this? Any ideas where I should begin?

  • KevonLooney@lemm.ee
    1 year ago

    Yeah, I believe you can do that by printing to a non-existent printer and then finding the resulting file waiting in the print queue. I don’t know if it works on Windows 11, but it used to work pretty well.