Hacker News

I haven’t seen any accusations that they’ve done that, though. Usually people get pirated material from sources that intentionally share pirated material.


They're not just training on pirated content, they've also scraped literally the entire internet and used that too.


Scraping the public internet is also not a CFAA violation.


CFAA bans accessing a protected computer without authorization. Hitting URLs denied by robots.txt has been argued to be just that.


> Hitting URLs denied by robots.txt has been argued to be just that.

"Has been argued" -- sure, but never successfully. In fact, in hiQ v. LinkedIn, the 9th Circuit ruled twice (once before, and again on remand after and applying the Supreme Court ruling in Van Buren v. US) that continuing to access data on a public website, despite a cease and desist on top of robots.txt, did not constitute access "without authorization" under the CFAA.


Now do every other jurisdiction


CFAA was mentioned specifically, which means only US jurisdiction is relevant here.



