Copy file handle so that there are two independent handles to the same file

I have a Python program which does the following:

  • It takes a list of files as input
  • It iterates through the list several times, each time opening the files and then closing them

What I would like is a way to open each file once at the beginning and then, on each iteration, make a copy of each file handle. Essentially this would be a copy operation on file handles that lets a file be traversed independently through multiple handles.

The reason is that on Unix systems, if a program holds a file handle and the corresponding file is then deleted, the program can still read the file. If I reopen the files by name on each iteration, they might have been renamed or deleted in the meantime, so that would not work. If I use f.seek(0) instead, it might affect another thread/generator/iterator that is reading the same handle.

I hope my question makes sense, and I would like to know if there is a way to do this.



Solution 1:[1]

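If you really want a copy of a file handle, you need the POSIX dup system call. In Python it is available as os.dup (see the os module documentation). If you have a file object (e.g. from calling open()), call its fileno() method to get the underlying file descriptor. Be aware that the descriptors produced by dup refer to the same open file description and therefore share one file offset: reading or seeking through one copy moves the position seen by the other.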

So the entire code will look like this:

import os

with open("myfile") as f:
    fd = f.fileno()       # get the underlying descriptor
    fd2 = os.dup(fd)      # duplicate the descriptor
    f2 = os.fdopen(fd2)   # wrap the duplicate in a file object; close f2 when done
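
Since the goal in the question is independent traversal, here is a minimal sketch (Unix only) of one possible workaround. It is not part of the original answer: it first shows that descriptors made with os.dup share a file offset, then uses os.pread to read at an explicit position without touching that offset. The helper name read_at and the temporary demo file are assumptions made for illustration.

```python
import os
import tempfile


def read_at(fd, offset, size):
    """Hypothetical helper: read up to `size` bytes starting at `offset`.

    os.pread does not move the file offset, so callers sharing the
    descriptor (or a dup of it) are not disturbed.
    """
    return os.pread(fd, size, offset)


# Demo file created only for this sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
fd2 = os.dup(fd)
os.unlink(path)  # on Unix the data stays readable through open descriptors

first = os.read(fd, 5)                       # advances the shared offset
shared_pos = os.lseek(fd2, 0, os.SEEK_CUR)   # fd2 reports the moved offset

head = read_at(fd2, 0, 5)                    # reads from the start again
pos_after = os.lseek(fd, 0, os.SEEK_CUR)     # offset unchanged by pread

os.close(fd)
os.close(fd2)
```

With this approach each iterator keeps track of its own position and passes it to read_at, so no reader depends on the descriptor's shared offset.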

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jerzy