PyPDF2 is a PDF toolkit that simplifies working with PDF documents in Python. Although the library exposes a good deal of useful functionality, this post focuses on how to merge two or more PDFs into a single document using functionality available in the PdfFileMerger class.
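PyPDF2 is not included in the Python Standard Library. To install it, run pip install PyPDF2, or install it manually from PyPI. The class names used in this post (PdfFileMerger, PdfFileReader) are those of PyPDF2 releases before 3.0; later releases rename them to PdfMerger and PdfReader.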

Assume a directory containing several PDF documents is to be compiled into a single document. With PyPDF2, it’s as simple as instantiating the PdfFileMerger class, calling its append method once per input file, and then writing the result to disk. We demonstrate below:

import os

from PyPDF2 import PdfFileMerger, PdfFileReader

# directory containing files to merge =>
pdf_dir = "C:\\PDFs\\"

# generate list of files for merging (os.listdir returns the names in pdf_dir) =>
pdf_files = [pdf_dir + i for i in os.listdir(pdf_dir) if i.endswith(".pdf")]

# instantiate PdfFileMerger instance =>
merger = PdfFileMerger()

# append each PDF in turn, reporting progress =>
for n, path in enumerate(pdf_files, start=1):
    print("Merging {} ({} of {})".format(path, n, len(pdf_files)))
    merger.append(PdfFileReader(open(path, 'rb')))

# write merged PDF to file (the output name "merged.pdf" is only an example) =>
merger.write("merged.pdf")

# close merge instance =>
merger.close()
PyPDF2’s interface is very intuitive, and in only a few lines of Python, you can create a custom tool that merges an arbitrary number of PDFs.
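
As a rough illustration of such a tool, the merge loop can be wrapped in a small reusable function. This is a sketch under stated assumptions rather than a definitive implementation: the name merge_all is hypothetical, and the function only assumes that the merger object it receives provides append, write and close methods, as PdfFileMerger does.

```python
def merge_all(paths, output_path, merger):
    """Append every path to `merger`, write the result, and always close it.

    `merger` is any object with append(), write() and close() methods,
    for example a PdfFileMerger instance (assumed, not imported here).
    """
    paths = [str(p) for p in paths]  # accept any iterable of str or Path
    try:
        for n, path in enumerate(paths, start=1):
            print("Merging {} ({} of {})".format(path, n, len(paths)))
            merger.append(path)
        merger.write(output_path)
    finally:
        # Release the merger's resources even if a file fails to load.
        merger.close()
    return output_path
```

With PyPDF2 installed, a call might look like merge_all(pdf_files, "merged.pdf", PdfFileMerger()), where the output name is again only an example. Passing the merger in as an argument keeps the function independent of any particular PyPDF2 release.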

Although we haven’t explored it here, the PdfFileReader class exposes many methods for obtaining information about a PDF document, such as the number of pages, page layout, page mode, and whether the document contains text fields. More information on the PdfFileReader class can be found here. Until next time, happy coding!