TokyoMonsterTrucker@lemmy.dbzer0.com to Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ@lemmy.dbzer0.com · English · 1 year ago

Digital textbook extraction

I have a digital subscription to a textbook, but it’s super annoying to have to use the website to access the book. I’d like to scrape the ebook and dump the contents into a PDF. I have downloaded proprietary PDFs from websites before using downloader browser plugins and predictable URLs, but this site is pretty locked down, with randomly generated URL tokens and a combination of XML and image data.

Has anyone managed to scrape a digital textbook like this? Any ideas where I should begin?

KevonLooney@lemm.ee · English · 1 year ago
Yeah, I believe you can do that by printing to a non-existent printer and then finding the spooled file waiting in the print queue. I don’t know if it works on Windows 11, but it used to work pretty well.
