Best way to protect sensitive information from being copied in HTML?

The company I work for requires me to protect an area where articles are rendered. I've implemented some measures against automated web scraping, but the problem remains for manual scraping.

The anti-bot protection mechanism seems to be working well so far, but I see clients attempting manual scraping.

What I have tried to protect the article contents:

  1. Set a copy event handler on the article's wrapper element to prevent copying. -> Clients can use userscripts (Greasemonkey, etc.) to bypass this easily by removing the event handler, or simply write scripts that copy the contents and save them to a file
  2. Developer console protection -> useless
  3. Redirect if F12 pressed -> useless

It seems that protecting HTML is not doable (unless someone tells me otherwise), so I'd like to know other ways to display text that make it totally UNABLE to be copied.

Things I've thought of:

  1. JS detection mechanisms to check whether the user has any sort of userscript running, in other words, whether malicious JS code is being injected and executed to extract the text
  2. Transforming the article's HTML into a PDF and displaying it inline with some sort of anti text-select/copy protection (if this even exists).
  3. Transforming the article's HTML into chunks of base64 images, which would make the text impossible to select and copy
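
For reference, idea 3 could look roughly like the sketch below. It is only an illustration under assumptions that are not in the question: it runs server-side in Python, it uses SVG instead of raster images to stay dependency-free, and the names `text_to_svg_data_uri` and `article_to_img_tags` are hypothetical. Text shown through an `<img>` cannot be selected in the page, but as the last lines show, decoding the data URI recovers it. Raster images would require OCR instead, which is harder but still possible.

```python
import base64
from html import escape


def text_to_svg_data_uri(lines, font_size=16, line_height=22, width=640):
    """Render lines of text as an SVG image and return a base64 data URI."""
    height = line_height * len(lines) + line_height // 2
    # One <text> node per line; escape() keeps the SVG well-formed.
    text_nodes = "".join(
        f'<text x="8" y="{line_height * (i + 1)}">{escape(line)}</text>'
        for i, line in enumerate(lines)
    )
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" font-family="serif" font-size="{font_size}">'
        f"{text_nodes}</svg>"
    )
    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
    return f"data:image/svg+xml;base64,{encoded}"


def article_to_img_tags(paragraphs):
    """Return one <img> tag per paragraph (each paragraph is a list of lines)."""
    return [f'<img alt="" src="{text_to_svg_data_uri(p)}">' for p in paragraphs]


uri = text_to_svg_data_uri(["Confidential paragraph, line one.", "Line two."])

# The weakness: anyone can decode the payload and read the text back.
recovered = base64.b64decode(uri.split(",", 1)[1]).decode("utf-8")
```

Note that the empty `alt` attribute hides the text from screen readers too, so this approach costs accessibility as well as convenience.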

Are there any good ways to prevent my content from being stolen without interfering much with the user experience? Unfortunately, Flash applets are no longer supported; they used to work like a charm in that era.

EDIT: C'mon folks, I just need ideas to at least make the end user's efforts a bit harder, e.g. you can't select text if it's displayed as images; you can only select the images themselves.

Thanks!



Solution 1:[1]

As soon as you ship HTML out of your machine, whoever gets it can mangle it at leisure. You can make it harder, but not impossible.
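
A minimal sketch of what this means in practice, using only Python's standard `html.parser`. The sample page, the `article` id, and the `ArticleTextExtractor` class are made up for illustration. Because the script never runs the page's JavaScript or applies its CSS, a copy handler or `user-select: none` has no effect on it:

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ArticleTextExtractor(HTMLParser):
    """Collect text inside the element whose id matches `wrapper_id`."""

    def __init__(self, wrapper_id):
        super().__init__()
        self.wrapper_id = wrapper_id
        self.depth = 0  # greater than 0 while inside the wrapper element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.wrapper_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


# Hypothetical page with the kind of protections described in the question.
page = """
<html><body>
<div id="article" oncopy="return false" style="user-select: none">
  <p>First protected paragraph.</p>
  <p>Second protected paragraph.</p>
</div>
<p>Footer</p>
</body></html>
"""

extractor = ArticleTextExtractor("article")
extractor.feed(page)
article_text = "\n".join(extractor.chunks)
```

The same holds for a logged-in client: whatever the server sends to the browser can be saved and processed like this.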

Rethink your approach. "Give information out" and "forbid its use" somewhat clash...

Solution 2:[2]

No, You Can't

Once the browser has loaded your page, you can't protect the content from being copied or downloaded.

Whether it is text, images, or videos, you can protect it from unauthorised access, but you can't protect it from being scraped by an authorised person.

You can, however, make it harder by using the steps you mentioned in your question and by enforcing copyright law.

This issue still exists on many sites, especially e-learning platforms such as Udemy. On those sites, premium courses still get copied or leaked by people who bought them.

From the Udemy FAQ:

For a motivated Pirate, however, any content that appears on a computer screen is vulnerable to theft. This is unavoidable and a problem across the industry. Giants like Netflix, Youtube, Amazon, etc. all have the same issue, and as an industry, we continue to work on new technology solutions to limit Piracy.

Because pirating techniques currently outpace protection, we hired a company who is specifically dedicated to enforcing the DMCA laws on your behalf and target violating individuals, hosting sites, and DNS servers in an attempt to get any unauthorized content removed.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 vonbrand
Solution 2 DisappointedByUnaccountableMod