{"id":4394,"date":"2025-04-15T14:51:03","date_gmt":"2025-04-15T12:51:03","guid":{"rendered":"https:\/\/booksfactory.pl\/blog?p=4394"},"modified":"2025-04-29T13:18:04","modified_gmt":"2025-04-29T11:18:04","slug":"ai-feeds-on-pirated-books-meta-and-the-authors-backlash","status":"publish","type":"post","link":"https:\/\/booksfactory.pl\/blog\/ai-feeds-on-pirated-books-meta-and-the-authors-backlash\/?lang=en","title":{"rendered":"AI Feeds on Pirated Books. Meta and the Authors&#8217; Backlash"},"content":{"rendered":"\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"768\" data-id=\"3787\" src=\"https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-1024x768.jpg\" alt=\"Mark Zuckerberg as a thief.\" class=\"wp-image-3787\" srcset=\"https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-1024x768.jpg 1024w, https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-300x225.jpg 300w, https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-768x576.jpg 768w, https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/figure>\n\n\n\n<div style=\"height:30px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Meta, the tech giant founded by Mark Zuckerberg, trained its latest artificial intelligence model, LLaMA, using a massive dataset of books sourced from the notorious piracy website Library Genesis (LibGen).<br><br>LibGen is an illegal repository hosting millions of publications, ranging from literary classics and academic works to contemporary bestsellers. 
For Meta, it served as a goldmine of training data. However, neither authors nor publishers gave their consent.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:24px\">Zuckerberg: It&#8217;s Legal<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Internal Meta documents reveal that Mark Zuckerberg approved using LibGen-sourced content for AI training. He argued that the move falls within the bounds of U.S. law, specifically under the &#8220;fair use&#8221; doctrine, which permits certain uses of copyrighted material for research or developmental purposes.<br><br>This interpretation is highly controversial and legally dubious, especially given the commercial intent and the sheer volume of works involved.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:24px\">Legal Sources? Too Expensive!<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The documents also show that Meta considered purchasing licenses from publishers, but staff reportedly deemed the proposed terms &#8220;unreasonably expensive.&#8221; Ultimately, they opted for the pirated material, not for lack of alternatives, but because it was the cheaper route.<br><br>It was not a necessity; it was a business decision.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:24px\">Was Your Book Used to Train AI? 
The Atlantic Can Tell You<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Although Meta has not released a complete list of the titles used to train its models, many authors and researchers <a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2025\/03\/search-libgen-data-set\/682094\/\" data-type=\"link\" data-id=\"https:\/\/www.theatlantic.com\/technology\/archive\/2025\/03\/search-libgen-data-set\/682094\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>are turning to The Atlantic&#8217;s searchable dataset<\/strong><\/a>, a resource based on pirated archives like LibGen.<br><br>The dataset includes about 190,000 titles and allows users to check whether their books were used to train AI language models by Meta and other major players like OpenAI and Anthropic.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:24px\">Legal Battles Underway. Europe Is Responding. Is the English-Speaking World?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In the U.S., lawsuits have already been filed over the unauthorized use of books in AI training. Notable plaintiffs include comedian Sarah Silverman and authors Paul Tremblay and Michael Chabon. These cases touch on copyright infringement and the lack of transparency from companies like OpenAI and Meta.<br><br>France is also preparing collective legal actions to protect the rights of authors and publishers from exploitation by Big Tech.<br><br>In contrast, the response from institutions in English-speaking countries such as the UK, Canada, and Australia has been muted. Organizations such as the UK Society of Authors and the Authors Guild in the U.S. have voiced concern. 
Still, legislative or governmental action has been minimal, despite evidence that works by prominent authors such as Margaret Atwood, George R. R. Martin, and Colson Whitehead appear in the datasets.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" style=\"font-size:24px\">Time for New Rules?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The unauthorized use of creative work to train artificial intelligence systems has become widespread. As a result, creative communities around the world are calling for new legal safeguards, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li style=\"margin-right:var(--wp--preset--spacing--50);margin-left:var(--wp--preset--spacing--50)\">full transparency about AI training data sources,<\/li>\n\n\n\n<li style=\"margin-right:var(--wp--preset--spacing--50);margin-left:var(--wp--preset--spacing--50)\">the ability for authors to opt out,<\/li>\n\n\n\n<li style=\"margin-right:var(--wp--preset--spacing--50);margin-left:var(--wp--preset--spacing--50)\">and reasonable compensation for the use of their work.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Without decisive regulation, books may soon be treated as free raw material, and copyright law rendered obsolete.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Sources:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li style=\"margin-right:var(--wp--preset--spacing--50);margin-left:var(--wp--preset--spacing--50)\"><a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2025\/03\/libgen-meta-openai\/682093\/\" data-type=\"link\" data-id=\"https:\/\/www.theatlantic.com\/technology\/archive\/2025\/03\/libgen-meta-openai\/682093\/\" target=\"_blank\" rel=\"noreferrer noopener\">The Atlantic<\/a><\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Meta has trained its AI using pirated books from LibGen. Authors are suing, and the industry is demanding transparency. Check if your books were used!
<\/p>\n","protected":false},"author":2,"featured_media":3788,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[67],"tags":[],"class_list":["post-4394","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-industry-news"],"rttpg_featured_image_url":{"full":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg",1200,900,false],"landscape":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg",1200,900,false],"portraits":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg",1200,900,false],"thumbnail":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-150x150.jpg",150,150,true],"medium":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-300x225.jpg",300,225,true],"large":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900-1024x768.jpg",1024,768,true],"1536x1536":["https:\/\/booksfactory.pl\/blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg",1200,900,false],"2048x2048":["https:\/\/booksfactory.pl\/
blog\/wp-content\/uploads\/2025\/04\/25_Moze-Zuckerberg-_1200x900.jpg",1200,900,false]},"rttpg_author":{"display_name":"Gabriel Augustyn","author_link":"https:\/\/booksfactory.pl\/blog\/author\/gaugustyn\/"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/booksfactory.pl\/blog\/category\/industry-news\/?lang=en\" rel=\"category tag\">Industry News<\/a>","rttpg_excerpt":"Meta has trained its AI using pirated books from LibGen. Authors are suing, and the industry is demanding transparency. Check if your books were used!","_links":{"self":[{"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/posts\/4394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/comments?post=4394"}],"version-history":[{"count":2,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/posts\/4394\/revisions"}],"predecessor-version":[{"id":4446,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/posts\/4394\/revisions\/4446"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/media\/3788"}],"wp:attachment":[{"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/media?parent=4394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/categories?post=4394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/booksfactory.pl\/blog\/wp-json\/wp\/v2\/tags?post=4394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}