{ "@context": "https://schema.org", "@type": "BlogPosting", "headline": "Spotify's Music Catalogue Was Scraped. What does it Mean?", "description": "Spotify confirms a massive catalogue scrape. What was taken, who Anna’s Archive is, and why it matters for artists and streaming.", "image": ["https://cdn.prod.website-files.com/624dd6511f2077f0f91d64df/6952d211d2783e88c73c89f4_spotify-1360002_1280.jpg"], "url": "https://www.unchainedmusic.io/blog-posts/spotify-music-catalogue-was-scraped-what-does-it-mean", "author": { "@type": "Person", "name": "Matt Waters" }, "publisher": { "@type": "Organization", "name": "Unchained Music", "logo": { "@type": "ImageObject", "url": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQF_bKsKBqe_x0l3ntAu9Gm3ZDOR95Ws-tkvQ&s" } }, "datePublished": "Dec 29, 2025", "dateModified": "Dec 29, 2025", "mainEntityOfPage": { "@type": "WebPage", "@id": "https://www.unchainedmusic.io" }, "inLanguage": "en-US", "isAccessibleForFree": true }

Spotify's Music Catalogue Was Scraped. What does it Mean?

Production & Music Industry
Updated on December 29, 2025
Written by Matt Waters
5 minutes
ARTICLE OVERVIEW
Spotify confirms a massive catalogue scrape. What was taken, who Anna’s Archive is, and why it matters for artists and streaming.

Image: a digital illustration of the Spotify scraping incident, reflecting the tension between users seeking free access to music files and the copyright violations that affect artists and the music industry.

In December 2025, Spotify confirmed that a massive portion of its music library had been scraped by the activist “shadow library” group Anna’s Archive. The group claims to have extracted approximately 300 terabytes of data, including metadata for nearly the entire Spotify catalog and tens of millions of audio files. A Spotify representative acknowledged the scraping activity and stated: "Spotify has identified and disabled the nefarious user accounts that engaged in unlawful scraping."

The incident has raised major questions about digital preservation, copyright, platform security, and the future of streaming-era music access. It also highlights critical issues around piracy on Spotify and the challenges facing the global music industry.

What Was Scraped in the Spotify Incident?

According to multiple reports:

  • Anna’s Archive claims to have scraped ~99.6% of songs that receive listens on Spotify, totalling around 86 million audio files.
  • The group also extracted metadata for 99.9% of Spotify’s 256 million tracks, providing a virtually complete snapshot of the service's music catalog.
  • The total dataset is estimated at 300 TB, making it one of the largest music data extractions ever recorded.
  • The scraping involved public metadata and some audio files.
  • The accounts involved were identified and disabled, and new safeguards were implemented to prevent future unauthorized access.
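The two headline figures measure different things: 86 million audio files is a count of tracks, while 99.6% refers to songs that actually receive listens. A minimal sketch of the arithmetic, using only the numbers reported above (which are the group's claims, not independently verified figures; the constant and function names are illustrative):

```python
# Figures as reported in the list above (claims by Anna's Archive).
TOTAL_TRACKS = 256_000_000   # tracks for which metadata was reportedly scraped
AUDIO_FILES = 86_000_000     # audio files reportedly scraped


def audio_share_of_catalogue(audio_files: int, total_tracks: int) -> float:
    """Return the fraction of catalogue tracks (by count) that have a scraped audio file."""
    return audio_files / total_tracks


share = audio_share_of_catalogue(AUDIO_FILES, TOTAL_TRACKS)
print(f"Audio files cover about {share:.1%} of tracks by count")
# prints: Audio files cover about 33.6% of tracks by count
```

If both claims hold, roughly a third of the catalogue by track count would account for nearly all listening, which would imply that the remaining tracks receive very few plays.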

What Is Anna’s Archive?

Anna’s Archive is a well‑known “shadow library” project that typically focuses on books, academic papers, and cultural preservation. In a blog post, the group framed the Spotify scrape as an attempt to create the first open music preservation archive, arguing that streaming platforms do not guarantee long‑term access to music, and that preserving culture in digital form is essential.

TechCrunch and other outlets reported that the group intends to release the scraped data, beginning with metadata, through torrent distribution, raising concerns about the integration of such data into the wider internet ecosystem.

Comparison with Other Music Archives

The Spotify scrape, with roughly 86 million audio files and 256 million metadata entries, surpasses previous large-scale music archives such as MusicBrainz (around 5 million unique tracks) and private trackers such as What.CD and Redacted.sh. However, the audio quality in the scrape (mostly 160 kbps OGG) is lower than that of the lossless archives maintained by some private communities.

A smartphone showing the user interface of the Spotify music streaming service.

Why it Matters for Artists and the Music Industry

Copyright & Licensing Risks

The scraping of audio files raises serious copyright concerns. Even if framed as preservation, distributing copyrighted music without permission is illegal in most jurisdictions and can lead to significant legal consequences.

Platform Security

Spotify stated that its internal systems were not breached, but the scale of the scrape shows how public-facing APIs and ordinary user accounts can be exploited to extract large portions of a catalogue.

Preservation vs. Piracy Debate

Anna’s Archive argues that streaming platforms are not permanent repositories and that music preservation is a cultural necessity. Critics argue that this justification does not override copyright law and the rights of artists.

Impact on Streaming Ecosystems

Wide distribution of the scraped audio files could:

  • Undermine streaming revenue and the premium features that support artists.
  • Increase piracy across multiple sites and platforms.
  • Pressure streaming services to adopt stricter access controls and technology updates.

The Scale and Technical Aspects of the Scrape

The scrape reportedly involved the use of thousands of accounts to systematically access and download metadata and audio tracks over time. The total data volume of approximately 300 terabytes includes:

Metadata: Detailed information about tracks, albums, artists, genres, release dates, and more, totaling around 256 million entries.

Audio files: Approximately 86 million tracks, mostly at 160 kbps OGG Vorbis quality, which is Spotify's standard for streaming to free accounts and some lower-quality streams.
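As a rough plausibility check, the reported totals can be turned into an average file size and an implied track length. This is a back-of-the-envelope sketch, not a measurement: it assumes decimal terabytes, treats the entire 300 TB as audio (the metadata share is not stated), and assumes every file is encoded at 160 kbps, whereas the text says only "mostly".

```python
# Reported figures; the unit and bitrate assumptions are labelled below.
TOTAL_BYTES = 300 * 10**12   # 300 TB, assuming decimal terabytes and all of it audio
AUDIO_FILES = 86_000_000     # reported number of audio files
BITRATE_BPS = 160_000        # 160 kbps, assumed for every file

avg_bytes_per_file = TOTAL_BYTES / AUDIO_FILES
bytes_per_second = BITRATE_BPS / 8            # 20,000 bytes of audio per second
implied_seconds = avg_bytes_per_file / bytes_per_second

print(f"Average file size: {avg_bytes_per_file / 1e6:.2f} MB")
print(f"Implied average length: {implied_seconds / 60:.1f} minutes")
# prints: Average file size: 3.49 MB
# prints: Implied average length: 2.9 minutes
```

An implied average of about three minutes per track is in the range of a typical song, so the 300 TB and 86 million figures are at least consistent with each other under these assumptions.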

The scraping process likely involved circumventing rate limits and access controls by distributing requests across multiple user accounts and IP addresses, mimicking normal user behavior to avoid detection.
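From a defender's point of view, the paragraph above explains why a per-account rate limit does not cap total extraction. The toy simulation below illustrates that point with a simple sliding-window limiter. It is a hypothetical sketch: `PerAccountLimiter`, `total_allowed`, and the limit of 10 requests per 60 seconds are invented for illustration and do not describe Spotify's actual controls.

```python
from collections import defaultdict, deque


class PerAccountLimiter:
    """Sliding-window limiter: at most `limit` requests per account per `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # account id -> timestamps of allowed requests

    def allow(self, account: str, now: float) -> bool:
        hits = self._hits[account]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return True
        return False


def total_allowed(accounts: int, seconds: int, limit: int = 10, window: float = 60.0) -> int:
    """Count allowed requests when every account tries once per second."""
    limiter = PerAccountLimiter(limit, window)
    allowed = 0
    for second in range(seconds):
        for n in range(accounts):
            if limiter.allow(f"acct-{n}", float(second)):
                allowed += 1
    return allowed


print(total_allowed(accounts=1, seconds=60))    # prints: 10
print(total_allowed(accounts=100, seconds=60))  # prints: 1000
```

Each account stays within its own limit, yet the aggregate grows linearly with the number of accounts. This is why defences against large-scale scraping usually also need signals that span accounts, such as shared infrastructure or similar access patterns.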

What Could Happen with the Scraped Data?

The scraped data has significant implications for various stakeholders:

Researchers and AI Developers: The vast metadata and audio files provide a rich dataset for music classification, recommendation algorithms, and generative AI models. The data can help improve music discovery tools and enable new music generation techniques.

Archivists and Preservationists: The scrape offers a snapshot of the current music catalog, which could be invaluable for preserving cultural heritage in a digital age where content availability is increasingly fragmented and ephemeral.

Pirates and Unauthorized Distributors: The data can be used to create unauthorized music libraries, enabling free access to music without paying for subscriptions or purchases.

Streaming Competitors: The dataset could facilitate the creation of alternative streaming services or personal media servers that mimic Spotify's catalog without licensing agreements.

A wooden gavel of the kind used in court, representing copyright law, piracy cases, and other court orders.

Legal and Ethical Implications

The unauthorized scraping and distribution of copyrighted music raise complex legal and ethical issues:

Copyright Infringement: Distributing copyrighted audio files without permission violates copyright laws in most countries and can lead to civil and criminal penalties.

Artist Compensation: Music piracy undermines the revenue streams of artists, record labels, and other rights holders, potentially impacting the sustainability of the music industry.

Digital Preservation vs. Piracy: While preservation of cultural artifacts is important, unauthorized copying and distribution challenge existing legal frameworks and raise questions about balancing access with rights protection.

Platform Liability: The incident highlights the challenges streaming services face in securing their content and the potential consequences of security lapses.

Spotify's Response and Industry Impact

Spotify has taken steps to:

  • Identify and disable user accounts involved in the scraping.
  • Implement enhanced safeguards against scraping and unauthorized access.
  • Collaborate with industry partners to protect artists' rights and combat piracy.

The incident has sparked discussions about the fragility of streaming platforms as custodians of music and the risks of relying solely on licensed, centralized services for access to cultural content.

Future of Music Streaming and Ownership

The incident has fueled debates about the future of music consumption, including:

  • The shift from ownership to access and its implications for users and artists.
  • The potential for decentralized or personal media servers using scraped or pirated content.
  • The need for improved digital preservation strategies that balance legal access with cultural heritage preservation.
  • The impact of AI-generated music and how it may reshape the industry.

Background on Spotify and Music Streaming

Spotify, launched in 2008, revolutionized the music industry by introducing a streaming service that offered access to millions of songs for a monthly subscription fee. Its model shifted the industry from ownership of physical media or digital downloads to access-based consumption. Spotify's catalog has grown to include hundreds of millions of tracks, including music, podcasts, and other audio content.

The platform offers both free and premium tiers. Free users listen with ads and limited features, while premium subscribers enjoy ad-free listening, offline downloads, and higher audio quality (up to 320 kbps). Spotify's success has influenced the rise of other streaming services like Apple Music, YouTube Music, Amazon Music, and Tidal, shaping the culture of music consumption worldwide.

Spotify's DRM and Anti-Piracy Measures

Spotify employs Digital Rights Management (DRM) to protect its music files from unauthorized copying and distribution. DRM restricts how users can access and share content, aiming to prevent piracy and unauthorized downloads. The company also monitors for unusual account activity and employs anti-scraping measures to protect its catalog.

Despite these efforts, the recent scraping incident revealed vulnerabilities in Spotify's system that allowed a third party to extract massive amounts of data, including audio files and metadata.

The Rise of Music Piracy in the Streaming Era

Although streaming services initially contributed to a decline in music piracy by offering affordable and convenient access, recent trends indicate a resurgence of piracy driven by:

  • Rising subscription costs.
  • Fragmentation of content across multiple platforms.
  • Geographic restrictions and licensing limitations.
  • Desire for ownership and offline access without restrictions.

The Spotify scrape exemplifies these tensions, as users seek alternatives to subscription-based models that limit control over their music libraries.

The Role of AI and Metadata in Music

The extensive metadata scraped from Spotify includes information on track features such as tempo, key, danceability, and popularity. This data is valuable for:

  • Enhancing music recommendation engines.
  • Training AI models for music generation and classification.
  • Supporting academic research in musicology and data science.

However, unauthorized use of this data raises questions about consent and compensation for artists whose work is used to train AI systems.
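To make the recommendation use case concrete, here is a minimal sketch of content-based similarity over track features such as tempo, danceability, and popularity. Everything in it is an assumption for illustration: the track names and feature values are made up (not taken from any scraped data), the scaling ranges are guesses, and musical key is left out because it is categorical rather than numeric.

```python
import math

# Hypothetical tracks with made-up feature values.
TRACKS = {
    "track_a": {"tempo": 120.0, "danceability": 0.80, "popularity": 65.0},
    "track_b": {"tempo": 124.0, "danceability": 0.75, "popularity": 60.0},
    "track_c": {"tempo": 70.0, "danceability": 0.20, "popularity": 30.0},
}

# Assumed value ranges, used to scale each feature to roughly 0..1.
RANGES = {"tempo": (40.0, 220.0), "danceability": (0.0, 1.0), "popularity": (0.0, 100.0)}


def to_vector(features: dict) -> list:
    """Scale each feature into 0..1 so that no single unit dominates the comparison."""
    return [(features[name] - lo) / (hi - lo) for name, (lo, hi) in RANGES.items()]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(track_id: str) -> str:
    """Return the other track whose scaled feature vector is closest to the query track."""
    query = to_vector(TRACKS[track_id])
    others = (t for t in TRACKS if t != track_id)
    return max(others, key=lambda t: cosine_similarity(query, to_vector(TRACKS[t])))


print(most_similar("track_a"))  # prints: track_b
```

Production recommendation systems combine far more signals, including listening behaviour, but the sketch shows why numeric per-track features are useful raw material for this kind of comparison.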

Conclusion

The Spotify scraping incident by Anna’s Archive represents a watershed moment in the ongoing tension between digital preservation, copyright enforcement, and user desires for ownership and unrestricted access. While the effort to archive and preserve music reflects legitimate cultural concerns, it also exposes vulnerabilities in streaming platforms and challenges existing legal frameworks. The music industry, artists, streaming services, and consumers must navigate these complex issues as the digital music landscape continues to evolve rapidly.

About Unchained Music

Unchained Music is a music distribution and artist & label services platform that empowers independent musicians and labels to distribute their music globally across 200+ streaming platforms. Offering tools such as AI mixing & mastering, marketing & playlist pitching, royalty advances, catalogue monetisation, and on-chain payout infrastructure, Unchained Music supports artists at every stage of their careers. With a core tier providing free distribution and 100% royalty retention, alongside premium tiers featuring enhanced services, Unchained Music is committed to fostering artist independence and fair compensation in the evolving digital music ecosystem.


Additional Resources for Readers

Verified External Sources

  • Beyond Machines: Spotify confirms 300 TB scrape by Anna’s Archive
  • Morning Brew: 86M songs scraped, metadata for 256M tracks
  • Digital Music News: Spotify disables accounts involved in scraping
  • Noise11: Spotify confirms metadata scrape, stresses no internal breach
  • TechCrunch: Anna’s Archive claims 86M audio files scraped

© 2025 Unchained Music. All rights reserved.