Create a list from given data to use in read_fwf

In read_fwf, the colspecs parameter is assigned a list, as in this example:

data2 = pd.read_fwf("sample.txt",index_col='Order number',names=['Order number', 'code', 'valid(1)/not-valid(0) transactions','Short description','long description'], engine='python', colspecs=[(0,5),(6,13),(14,15),(16,76),(77,400)], header=None)

I have a table that gives the starting position and width of each column, like this (example):

 ____________________________________________
 |position|length|column name               |
 |1       |5     |code                      |
 |6       |20    |short description         |
______________________________________________

How can I code this so that I can extract the data into a list like this: [(0,5),(6,20)]?

Additional information:

   Position  Length  Contents
0         1       5  Order number
1         6       1  Blank
2         7       7  code
3        14       1  Blank
4        15       1  valid(1)/not-valid(0) transactions
5        16       1  Blank
6        17      60  Short description
7        77       1  Blank
8        78     400  Long description

If I use your suggestion, it gives output like this:

[(1, 5),
 (6, 1),
 (7, 7),
 (14, 1),
 (15, 1),
 (16, 1),
 (17, 60),
 (77, 1),
 (78, 400)]

But I want [(1,5),(6,13),(14,15),(16,76),(77,400)].

Here I combined each Blank row with the row that follows it in the table; for this example, that is the output I want.



Solution 1:[1]

You could use zip for that:

import pandas as pd

data2 = pd.DataFrame({"position": [1, 6], "length": [5, 20], "column name": ["code", "short description"]})
print(data2)
#    position  length        column name
# 0         1       5               code
# 1         6      20  short description

data3 = list(zip(data2.position, data2.length))
print(data3)
# [(1, 5), (6, 20)]

zip returns an iterator. So if you just want to iterate over it later (e.g. in a for loop), you don't need to convert it to a list:

data4 = zip(data2.position, data2.length)
for x in data4:
    print(x)
# (1, 5)
# (6, 20)
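
Because the two fields in this small example sit directly next to each other, the length column might also be passed straight to the widths parameter of read_fwf, which pandas offers as an alternative to colspecs for contiguous fields. This is only a sketch: the name layout and the sample rows below are made up for illustration and are not from the question.

```python
import io
import pandas as pd

# Hypothetical layout table with the same shape as the example above.
layout = pd.DataFrame({
    "position": [1, 6],
    "length": [5, 20],
    "column name": ["code", "short description"],
})

# Made-up sample rows, padded to the widths given in the layout table.
rows = [("A0001", "first item"), ("B0002", "second item")]
text = "\n".join(code.ljust(5) + desc.ljust(20) for code, desc in rows)

# widths takes the field lengths directly when the fields are contiguous.
df = pd.read_fwf(
    io.StringIO(text),
    widths=layout["length"].tolist(),
    names=layout["column name"].tolist(),
    header=None,
)
print(df)
```

This only works when there are no gaps between fields; if the layout has filler columns, they would have to be listed in widths as well and dropped afterwards.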

Edit:

So with the blank lines and the irregularity in the last line, I would use a different approach:

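# data2 is the Position/Length/Contents table from the question.
# Build (start, end) pairs, skipping the "Blank" filler rows.
colspecs = [
    (row["Position"], row["Position"] + row["Length"])
    for i, row in data2.iterrows()
    if row["Contents"] != "Blank"
]

# The last row is irregular: use its Length as the end value.
colspecs[-1] = (colspecs[-1][0], data2.Length.values[-1])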
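
One thing to check is the indexing convention. pandas documents colspecs as half-open, zero-based intervals, while the Position column in the question looks 1-based (the question's own read_fwf call starts at 0). A minimal sketch, assuming that is the case, subtracts 1 from each position. The name layout and the sample line are made up for illustration.

```python
import io
import pandas as pd

# The layout table from the question, rebuilt by hand for this sketch.
layout = pd.DataFrame({
    "Position": [1, 6, 7, 14, 15, 16, 17, 77, 78],
    "Length": [5, 1, 7, 1, 1, 1, 60, 1, 400],
    "Contents": ["Order number", "Blank", "code", "Blank",
                 "valid(1)/not-valid(0) transactions", "Blank",
                 "Short description", "Blank", "Long description"],
})

# Drop the filler rows, then convert 1-based positions to 0-based offsets.
fields = layout[layout["Contents"] != "Blank"]
starts = fields["Position"] - 1       # assumption: Position is 1-based
ends = starts + fields["Length"]      # end is exclusive
colspecs = list(zip(starts.tolist(), ends.tolist()))
print(colspecs)
# [(0, 5), (6, 13), (14, 15), (16, 76), (77, 477)]

# Made-up sample line that follows the layout (fields separated by one blank).
line = ("12345" + " " + "ABCDEFG" + " " + "1" + " "
        + "short text".ljust(60) + " " + "long text")

df = pd.read_fwf(io.StringIO(line + "\n"), colspecs=colspecs,
                 names=fields["Contents"].tolist(), header=None)
print(df.loc[0, "code"], "|", df.loc[0, "Short description"])
```

The first four pairs match the colspecs used in the question. The last one ends at 477 rather than 400 because it is computed from the stated length, so cap it by hand if the file's lines really stop at column 400.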

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1