The data.
- jaj160
- Sep 27, 2021
- 2 min read
Updated: Oct 2, 2021
Thanks to the US Copyright Act of 1976, it is not necessary to register a work for it to be protected by copyright.
Still, people register anyway: because there's a certain glam factor in it, because they don't know the Copyright Act of 1976 as well as some nerds do, or because registration can, purportedly, be helpful if someone needs to enforce a copyright (aka: "sue for infringement").
Pretty much, so long as a work is "a work of authorship", so long as it's original, and so long as it's "fixed in a tangible medium", it will be subject to the protections of copyright. However, as you may be able to guess by the quotes, those can be slippery criteria and not all works meet them. The Copyright Office reviews all applications for registration of copyright and, yes, it does reject some. When this happens, the applicant can appeal - twice. The final result of the final appeal is delivered to the applicant in the form of a multi-page letter detailing all of the reasons why the appeal was rejected or (rarely) approved.
Such letters comprise this dataset. Specifically, I have found my way to an online database of the US Copyright Office's Review Board Decisions and downloaded PDFs of all decisions from 2019 to today (which is September 27, 2021, by the way). I am now in the process of exporting them to XML. This takes a million years. Should you wish to play along with the data, you can download your own decision letters from the link above or access mine via the links below.
(Just kidding. As you'll note in a subsequent post, I had to adjust my approach and convert not to XML but to .txt instead. I've removed the data from this page so as not to litter up the internet with redundant datasets. You can find my data here.)
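If you'd like to try the PDF-to-.txt step yourself, here is a minimal sketch of one way it could go. It is not the exact process I used: the choice of the third-party pypdf library, the function names `pdf_to_text` and `convert_folder`, and the folder names in the usage note are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, List


def pdf_to_text(pdf_path: Path) -> str:
    """Extract the text of one PDF with pypdf (an assumed library choice)."""
    # Imported here so the rest of the script still loads without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    # Scanned pages may yield no text at all, hence the fallback to "".
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def convert_folder(
    src: Path,
    dst: Path,
    extract: Callable[[Path], str] = pdf_to_text,
) -> List[Path]:
    """Write one .txt file into dst for every .pdf in src; return the paths written."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(src.glob("*.pdf")):
        txt_path = dst / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

With the decision letters saved in a hypothetical folder called `decisions_pdf`, calling `convert_folder(Path("decisions_pdf"), Path("decisions_txt"))` would write one text file per letter. The `extract` parameter is there so a different extraction tool can be swapped in if pypdf handles a particular letter poorly.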

Zoe ambigram, submitted for copyright registration by Adam Zaner in 2019.
Rejected for the second and final time in January 2021.