PyPDF2 PdfFileWriter has no attribute stream

I am trying to split a PDF into its pages and save each page as a new PDF. I have tried the method from a previous question and the PyPDF2 split example, both without success. EDIT: Looking at the output files, I can see that the first page is written successfully; the PDF for the second page is then created but is empty.

Here is the code I am trying to run:

from PyPDF2 import PdfFileWriter, PdfFileReader

inputpdf = PdfFileReader(open("my_pdf.pdf", "rb"))

for i in range(inputpdf.numPages):
    output = PdfFileWriter()
    output.addPage(inputpdf.getPage(i))
    with open("document-page%s.pdf" % i, "wb") as outputStream:
        output.write(outputStream)

Here is the full error message:

Traceback (most recent call last):
  File "pdf_functions.py", line 9, in <module>
    output.write(outputStream)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 557, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 575, in _sweepIndirectReferences
    if data.pdf.stream.closed:
AttributeError: 'PdfFileWriter' object has no attribute 'stream'

I also tried this and confirmed that I can indeed extract a single page.

from PyPDF2 import PdfFileWriter, PdfFileReader
inputpdf = PdfFileReader(open("/home/ubuntu/inputs/cityshape/form5.pdf", "rb"))

#for i in range(inputpdf.numPages):
output = PdfFileWriter()
output.addPage(inputpdf.getPage(2))
with open("document-page2.pdf", "wb") as outputStream:
    output.write(outputStream)


Solution 1:[1]

The same thing happened to me.

I was able to solve it by moving the following line inside the loop:

inputpdf = PdfFileReader(open("/home/ubuntu/inputs/cityshape/form5.pdf", "rb"))

I believe some versions of PyPDF2 have a bug: invoking the PdfFileWriter.write method interferes with the PdfFileReader instance. Recreating the PdfFileReader instance after each write works around this bug.

The following code should work (untested):

from PyPDF2 import PdfFileWriter, PdfFileReader

pdf_in_file = open("my_pdf.pdf",'rb')

inputpdf = PdfFileReader(pdf_in_file)
pages_no = inputpdf.numPages

for i in range(pages_no):
    inputpdf = PdfFileReader(pdf_in_file)
    output = PdfFileWriter()
    output.addPage(inputpdf.getPage(i))
    with open("document-page%s.pdf" % i, "wb") as outputStream:
        output.write(outputStream)

pdf_in_file.close()        
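
The description above (moving the whole PdfFileReader(open(...)) line inside the loop) can also be sketched with a fresh file handle per page, so that every handle is closed by a with block. This is a minimal sketch assuming the PyPDF2 1.x API used in the question; the split_pdf and page_filename names and the pattern parameter are hypothetical, and the code has not been tested against the affected versions.

```python
def page_filename(pattern, index):
    """Build the output name for one page, e.g. 'document-page0.pdf'."""
    return pattern % index


def split_pdf(path, pattern="document-page%s.pdf"):
    """Write every page of `path` to its own file and return the file names."""
    # Imported here so the PyPDF2 1.x dependency is explicit and local.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    # Count the pages once with a short-lived reader.
    with open(path, "rb") as handle:
        page_count = PdfFileReader(handle).numPages

    written = []
    for i in range(page_count):
        # Reopen the file and rebuild the reader for every page, so no
        # reader instance is reused after a call to PdfFileWriter.write.
        with open(path, "rb") as handle:
            reader = PdfFileReader(handle)
            writer = PdfFileWriter()
            writer.addPage(reader.getPage(i))
            name = page_filename(pattern, i)
            # Write while the input handle is still open.
            with open(name, "wb") as output_stream:
                writer.write(output_stream)
        written.append(name)
    return written
```

Calling split_pdf("my_pdf.pdf") would then produce document-page0.pdf, document-page1.pdf, and so on, one per page.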

Solution 2:[2]

I solved the error "AttributeError: 'PdfFileWriter' object has no attribute 'stream'" by reopening the PDF on every iteration.

My old code:

pdf = PdfFileReader('arq.pdf')
pagi = 14
pagf = 20
dic = PdfFileMerger()

for i in range(pagi -1, pagf):

  pag = PdfFileWriter()
  pag.addPage(pdf.getPage(i))

  with open('pag.pdf', 'wb') as split:

    pag.write(split)

  pag = PdfFileReader('pag.pdf')
  dic.append(pag)

with open(f'PDF ({pagi} - {pagf}).pdf', 'wb') as split:

  dic.write(split)

!rm pag.pdf

My new code:

pdf = PdfFileReader('arq.pdf')
pagi = 14
pagf = 20
dic = PdfFileMerger()

for i in range(pagi - 1, pagf):

  pag = PdfFileWriter()
  pag.addPage(pdf.getPage(i))

  with open('pag.pdf', 'wb') as split:

    pdf = PdfFileReader('arq.pdf') # Adding pdf again
    pag.write(split)

  pag = PdfFileReader('pag.pdf')
  dic.append(pag)

with open(f'PDF ({pagi} - {pagf}).pdf', 'wb') as split:

  dic.write(split)

!rm pag.pdf

Hugs!
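
As a possible alternative to writing and re-reading a temporary pag.pdf, PdfFileMerger.append in PyPDF2 1.x accepts a pages argument given as a (start, stop) tuple, which avoids the intermediate PdfFileWriter altogether. This swaps in a different technique from the one in the answer, and it is only a sketch: to_page_range and extract_range are hypothetical helper names, and the code has not been tested against the versions that show the error.

```python
def to_page_range(first_page, last_page):
    """Convert a 1-based inclusive page range to a 0-based (start, stop) tuple."""
    return (first_page - 1, last_page)


def extract_range(path, first_page, last_page):
    """Copy pages first_page..last_page of `path` into a new PDF."""
    from PyPDF2 import PdfFileMerger  # PyPDF2 1.x API, as in the answer

    merger = PdfFileMerger()
    # `pages` selects the slice of the source document to append.
    merger.append(path, pages=to_page_range(first_page, last_page))
    out_name = f"PDF ({first_page} - {last_page}).pdf"
    with open(out_name, "wb") as out_file:
        merger.write(out_file)
    merger.close()
    return out_name
```

With the values from the answer, extract_range('arq.pdf', 14, 20) would write pages 14 to 20 into 'PDF (14 - 20).pdf'.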

Solution 3:[3]

I ran into this problem today. I found plenty of code just like mine that ran without errors, so I suspected a version problem. I was using PyPDF2 version 1.27.3; changing the version to 1.25.0 fixed the error.

pip install pypdf2==1.25.0
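
Before pinning, it may help to confirm which version is actually installed. The following is a minimal sketch using importlib.metadata from the standard library (Python 3.8 or later, which is an assumption, since the traceback in the question comes from Python 3.4). The helper names are hypothetical, and the only version facts used are the two reported in this answer.

```python
from importlib import metadata

# Versions as reported in this answer; not an exhaustive list.
REPORTED_FAILING = "1.27.3"
REPORTED_WORKING = "1.25.0"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a plain dotted version such as '1.27.3' into (1, 27, 3)."""
    return tuple(int(part) for part in version.split("."))


current = installed_version("PyPDF2")
if current is None:
    print("PyPDF2 is not installed")
elif current == REPORTED_FAILING:
    print("PyPDF2", current, "is the version reported to fail; try", REPORTED_WORKING)
else:
    print("PyPDF2", current, "is installed")
```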

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 slee423
Solution 2 Wallef Santos
Solution 3 s gong