Dec. 2nd, 2022 06:28 pm via https://ift.tt/v2f8BU7
brightwanderer https://brightwanderer.tumblr.com/post/702534495098847232/some-additional-comments-about-sudowrite :
Some additional comments about Sudowrite, hopefully of use to people.
There’s not a lot of point in locking your works. The AI has been trained. It’s done. Not only that, but the data itself remains archived and can be reused.
The model underneath Sudowrite is a bigger project called GPT-3, and a large share of its training data comes from a resource called Common Crawl. Common Crawl trawls a huge portion of the public web, much like the crawlers that search engines (including Google) use to index content, though it is a separate nonprofit crawl with its own bot. So it’s probably not that the people behind it have intentionally targeted AO3, though I am fascinated by the high specificity of fandom content in my results.
That’s going to make things difficult for AO3 staff to deal with, though. A site can tell Common Crawl’s bot to stay out from now on without affecting search engines, but that does nothing about what has already been collected, and if you shut crawlers out of the site altogether, pages will no longer show up in searches at all.
This is understandably creepy as all hell, but my personal opinion is that locking my fanworks will do me more harm than leaving them open.
Action is needed to address the root problem of these AI training datasets in all fields, but I don’t think focusing on the ethical or artistic arguments is the way to change things. I think we should be leaning hard on IP and copyright law here. You can’t copyright a sentence or a phrase, but once you get enough individual factors together (character names, similarity of narrative, concepts) it starts to get dicey. So for example, if Sudowrite always gives a character named Harry glasses and black hair… well, we all know why that is, but it’s unlikely to be actionable.
If Sudowrite always associates a character named Harry with characters named Hermione, Neville, and Sirius, and consistently produces concepts like wands, potions, schools, etc… it starts to look like the kind of problem IP holders may take an interest in. Sudowrite claims that all its text is original and not directly lifted from anywhere - which may be technically true - but ironically, copyright and IP law is based on the whole of a work and its context… and if the AI can’t help but imitate those works and contexts, things are going to get interesting.
While Elon Musk is eminently blameable for all things, he doesn’t have any personal involvement here as far as I know; these people have just got some of his money.