File Path Details | Using command prompt or any script
Can anybody help me with a command prompt command, a script, or some Python code to get file details?
Scenario:
A folder contains many subfolders. How can I find out which file formats are present in those folders, and how can I get the paths of all those files?
That is, I need the distinct file names, formats, and paths of the files under a folder and its subfolders.
Is there any way to get that automatically, or is manual effort the only option?
Solution 1:[1]
To recursively list all files in folders and sub-folders in Python:
Glob [docs]
from glob import glob
# "**" with recursive=True matches files and directories at any depth below the current directory
glob("**", recursive=True)
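Since the recursive glob returns directories as well as files, a little filtering is needed to get to the distinct formats asked about in the question. A minimal sketch, assuming extensions should be compared case-insensitively; the helper names `list_files` and `distinct_extensions` are made up for illustration and are not part of the answer:

```python
import os
from glob import escape, glob


def list_files(root):
    """Return the paths of all files (not directories) under root.

    Note: glob skips names starting with a dot by default.
    """
    # escape() keeps special characters in root from being read as wildcards
    pattern = os.path.join(escape(root), "**")
    return sorted(p for p in glob(pattern, recursive=True) if os.path.isfile(p))


def distinct_extensions(paths):
    """Return the sorted set of lowercase extensions, e.g. ['.csv', '.txt']."""
    exts = {os.path.splitext(p)[1].lower() for p in paths}
    exts.discard("")  # files without an extension
    return sorted(exts)


if __name__ == "__main__":
    files = list_files(".")
    print(distinct_extensions(files))
```

Calling `list_files` on the top-level folder gives the full list of paths, and `distinct_extensions` reduces that list to the formats present.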
OS Walk [docs]
import os
# Each item is a (dirpath, dirnames, filenames) tuple, one per directory
list(os.walk("./"))
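The tuples from `os.walk` only hold bare file names, so the directory path has to be joined back on to get full paths. The following is a rough sketch of how that could be combined with grouping by format; `group_paths_by_extension` is a hypothetical helper name, and lowercasing the extension is an assumption rather than something the answer specifies:

```python
import os
from collections import defaultdict


def group_paths_by_extension(root):
    """Map each lowercase extension (e.g. '.txt') to the file paths that have it."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns "" as the extension when the name has none
            ext = os.path.splitext(name)[1].lower()
            groups[ext].append(os.path.join(dirpath, name))
    return dict(groups)


if __name__ == "__main__":
    for ext, paths in sorted(group_paths_by_extension(".").items()):
        print(ext or "(no extension)", len(paths))
```

The keys of the returned dictionary are the distinct formats, and each value is the list of paths for that format. Unlike glob, `os.walk` also reports files whose names start with a dot.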
Solution 2:[2]
import os, csv
import glob
import pandas as pd
import ast
dir_path = r'<path of directory>'
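# Two separate CSV outputs: per-folder extension counts, and one row per file
extension_output_path = r"<path of output file. Path where you want to save the extension counts in csv format>"
output_filenames_path = r"<path of output file. Path where you want to save the file paths in csv format>"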
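# Distinct extensions (the text after the last dot) of all files under dir_path
exts = list({f.split('.')[-1] for _, _, files in os.walk(dir_path) for f in files if '.' in f})
# Every directory under dir_path, including dir_path itself
subdirs = [x[0] for x in os.walk(dir_path)]
print(exts)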
big_list = []
bigg_list = []
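def FindMaxLength(lst):
    # Length of the longest list in lst
    maxLength = max(map(len, lst))
    return maxLength
# Split each directory path into its components (assumes Windows '\' separators)
for dirs in subdirs:
    split_dirs = dirs.split('\\')
    big_list.append(split_dirs)
big_list_count = FindMaxLength(big_list)
# Pad every row to the same depth, then append the full path as the last column
for subdis in big_list:
    count_val = big_list_count - len(subdis)
    bigg_list.append(subdis + [''] * count_val + ['/'.join(subdis)])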
output_list = []  # one row per folder: path components, full path, {extension: count}
path_list = []    # one row per file: [extension, file path]
for subbs in bigg_list:
    big_dict = {}
    for ext in exts:
        # subbs[-1] is the folder's full path.
        # glob.glob1 is an undocumented helper; on Python 3.10+ the documented
        # equivalent is glob.glob("*." + ext, root_dir=subbs[-1]).
        filenames = glob.glob1(subbs[-1], "*." + ext)
        for name in filenames:
            path_list.append([ext, subbs[-1] + '/' + name])
        if filenames:
            big_dict[ext] = len(filenames)
    output_list.append(subbs + [big_dict])
# Header: path-component columns, the full-path column, the raw dict column, then one column per extension
columns_row = ['col'] * (big_list_count + 1) + ['val'] + exts
with open(extension_output_path, 'w', newline='') as csv_file:
    csv_wr = csv.writer(csv_file)
    csv_wr.writerow(columns_row)
    csv_wr.writerows(output_list)
# Re-read the CSV and expand the dict stored in 'val' into the per-extension columns
cv = pd.read_csv(extension_output_path)
for index, row in cv.iterrows():
    for ext in exts:
        if row['val'] != '{}' and ext in ast.literal_eval(row['val']):
            cv.loc[index, ext] = ast.literal_eval(row['val'])[ext]
del cv['val']
cv.to_csv(extension_output_path, index=False)
# Second CSV: one row per file with its extension and path
with open(output_filenames_path, 'w', newline='') as csv_file:
    csv_wr = csv.writer(csv_file)
    csv_wr.writerow(['extension', 'filename'])
    csv_wr.writerows(path_list)
print("completed")
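The first output file will contain each folder/subfolder path with the file count per extension. The second output file will contain one row per file, with its extension and path.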
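If only the per-folder counts are needed, the same kind of report can probably be produced with the standard library alone. This is a simplified sketch and not the code from this answer: the function names `count_extensions_per_folder` and `write_counts_csv` are hypothetical, extensions are lowercased, names like `.bashrc` are treated as having no extension, and missing extensions are written as 0 rather than left blank.

```python
import csv
import os
from collections import Counter


def count_extensions_per_folder(root):
    """Return {folder_path: Counter({extension: file_count})} for root and its subfolders."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        folder_counts = Counter()
        for name in filenames:
            ext = os.path.splitext(name)[1].lower().lstrip(".")
            if ext:  # skip files without an extension
                folder_counts[ext] += 1
        counts[dirpath] = folder_counts
    return counts


def write_counts_csv(counts, out_path):
    """Write one row per folder and one column per extension seen anywhere."""
    exts = sorted({ext for c in counts.values() for ext in c})
    with open(out_path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["folder"] + exts)
        for folder in sorted(counts):
            writer.writerow([folder] + [counts[folder].get(ext, 0) for ext in exts])
```

A call such as `write_counts_csv(count_extensions_per_folder(dir_path), extension_output_path)` would then write a single table without the intermediate dict column or the pandas round trip.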
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Arav R |
Solution 2 | Saurabh |