Sometimes businesses or contacts will send you a PDF containing your personal data. To protect your privacy, the PDF is password protected, often with a trivial password such as your birthday. Still, a date can be written in so many ways that you are bound to have forgotten the exact format when you come across the file in three years' time. Enter: PDF unprotection.

Install qpdf with your favorite package manager, and you can quickly create an unprotected version of your PDF:

qpdf --decrypt --password=PASS input.pdf output.pdf
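
If the password is a birthday but you no longer remember how it was written, a small loop can try the usual spellings for you. The following is a minimal sketch, not a definitive tool: the helper names birthday_candidates and try_birthday and the list of date formats are assumptions made for illustration, and it relies on qpdf exiting with status 0 on success or 3 on success with warnings.

```shell
# Print candidate passwords for a date given as YYYY MM DD (zero-padded).
# The list of formats is an illustrative guess; extend it as needed.
birthday_candidates() {
  local y="$1" m="$2" d="$3" yy
  yy=${y#??}   # two-digit year, e.g. 1990 -> 90
  printf '%s\n' \
    "$d$m$y" "$m$d$y" "$y$m$d" "$d$m$yy" "$m$d$yy" \
    "$d.$m.$y" "$d/$m/$y" "$m/$d/$y" "$y-$m-$d"
}

# Usage: try_birthday INPUT OUTPUT YYYY MM DD
# Tries each candidate and stops at the first one qpdf accepts.
try_birthday() {
  local src="$1" dst="$2" pw status
  shift 2
  # Candidates contain no whitespace or glob characters, so word
  # splitting the command substitution is safe here.
  for pw in $(birthday_candidates "$@"); do
    status=0
    qpdf --decrypt --password="$pw" "$src" "$dst" 2>/dev/null || status=$?
    # 0 = success, 3 = success with warnings; anything else is a failure.
    if [ "$status" -eq 0 ] || [ "$status" -eq 3 ]; then
      printf 'password: %s\n' "$pw"
      return 0
    fi
  done
  return 1
}
```

With these functions loaded, a call such as try_birthday input.pdf output.pdf 1990 04 07 would write output.pdf and print the spelling that worked, or return a non-zero status if none of the candidates fit.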

Interestingly, a PDF can also be "protected" without a password. In this case, the PDF opens normally but is resistant to editing and possibly even to printing. Under the hood, the same password protection mechanism is used, just with an empty password. You can unprotect it as above, without specifying a password:

qpdf --decrypt input.pdf output.pdf
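
To see which kind of protection a file has before decrypting it, qpdf can report the password status through its exit code. This is a minimal sketch, assuming a qpdf recent enough to support the --requires-password option, whose manual documents the exit codes used below; the function name pdf_protection is hypothetical.

```shell
# Classify a PDF by the exit status of "qpdf --requires-password":
#   0 = a password is required, 2 = not encrypted,
#   3 = encrypted, but it opens without a password.
pdf_protection() {
  local status=0
  qpdf --requires-password "$1" || status=$?
  case $status in
    0) echo "password required" ;;
    2) echo "not encrypted" ;;
    3) echo "restricted, no password needed" ;;
    *) echo "unknown (qpdf exit status $status)" ;;
  esac
}
```

Calling pdf_protection input.pdf then prints one of these labels, which tells you whether the first or the second qpdf command above applies.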

So how do you now find all the PDFs you've collected over the last 10 years that are password protected? This should do the trick:

fd -e pdf -x bash -c '(pdfinfo "{}" 2>&1 >/dev/null | grep -q "Incorrect password") && echo "{}"'

Let's break down the command:

  • The command fd searches for files.
  • The argument -e pdf limits matches to those with the .pdf extension.
  • The argument -x ... runs the ... command on each file, replacing {} with the concrete filename.

Now let's look at what we run for each PDF:

  • The command bash -c '...' runs the rest of the command as a bash snippet.
  • The command pdfinfo prints info about PDFs (but we are only interested in error messages).
  • The syntax 2>&1 >/dev/null | throws away stdout and pipes stderr to what follows.
  • The command grep -q "Incorrect password" prints nothing and returns an error code unless this string is found.
  • The final (...) && echo "{}" prints the filename if the previous segment is successful.
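
One caveat with the one-liner: fd substitutes the filename textually into the bash snippet, so a name containing a double quote or a $ can break the command. It also only reports files that need a password to open, not the ones "protected" with an empty password. A possible variant, sketched here under assumptions, passes the path as an argument and checks for both cases. The function names needs_password, is_restricted and list_protected are hypothetical, and the Encrypted: line is assumed to look as it does in the output of Poppler's pdfinfo.

```shell
# Succeeds if pdfinfo cannot open the file without a password.
needs_password() {
  local err
  # Keep only stderr; "|| true" because pdfinfo fails on locked files.
  err=$(pdfinfo "$1" 2>&1 >/dev/null </dev/null) || true
  case $err in
    *"Incorrect password"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the file opens without a password but is still encrypted.
# Assumes pdfinfo prints a line such as "Encrypted: yes (print:no ...)".
is_restricted() {
  local info
  info=$(pdfinfo "$1" 2>/dev/null </dev/null) || return 1
  printf '%s\n' "$info" | grep -q '^Encrypted:.*yes'
}

# Read one path per line on stdin and label each protected file.
list_protected() {
  local f
  while IFS= read -r f; do
    if needs_password "$f"; then
      printf 'password\t%s\n' "$f"
    elif is_restricted "$f"; then
      printf 'restricted\t%s\n' "$f"
    fi
  done
}

# Example invocations (commented out so the block only defines functions):
#   fd -e pdf | list_protected
#   fd -e pdf -x bash -c 'pdfinfo "$1" 2>&1 >/dev/null | grep -q "Incorrect password" && echo "$1"' _ {}
```

In the last commented line, the trailing _ fills $0 of the snippet and fd replaces the separate {} argument with the path, which then arrives intact as "$1" regardless of the characters it contains.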

And now you know which PDFs you still need to unprotect.