Creating new pandas columns from substrings in a list
I have a CSV with a column called 'Features', which is of this form:
0 [Shops: Close by, Passing trade: Yes]
1 [Lift: Yes, No of Bedrooms: 1, Bedroom 1 Dims:...
2 [Lift: Yes, No of Bedrooms: 2, Bedroom 1 Dims:...
3 [No of Bedrooms: 4, Bedroom 1 Dims: 4.80 x 5.0...
4 [Finish: Excellent, Airconditioning: Yes, Shop...
...
and would like to create new pandas columns for the number of bedrooms.
0 [N/A]
1 [1]
2 [2]
3 [4]
4 [N/A]
...
I have tried something like this in Python:
csvname['No of Bedrooms'] = [s for s in csvname['Features'] if 'No of Bedrooms' in s]
This did not work. Is there an easy way of doing this? Any help would be greatly appreciated.
Solution 1:[1]
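You can try .str.extract, which returns the first regex capture group for each row and NaN where the pattern does not match. The pattern is written as a raw string so that \d is not treated as a string escape, and expand=False returns a Series rather than a one-column DataFrame:
csvname['No of Bedrooms'] = csvname['Features'].astype(str).str.extract(r'No of Bedrooms: (\d+)', expand=False)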
print(csvname)
Features No of Bedrooms
0 [Shops: Close by, Passing trade: Yes] NaN
1 [Lift: Yes, No of Bedrooms: 1, Bedroom 1 Dims:... 1
2 [Lift: Yes, No of Bedrooms: 2, Bedroom 1 Dims:... 2
3 [No of Bedrooms: 4, Bedroom 1 Dims: 4.80 x 5.0... 4
4 [Finish: Excellent, Airconditioning: Yes, Shop... NaN
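For anyone who wants to run this end to end, here is a minimal sketch of the same approach. The sample rows are made up to resemble the question's data (the truncated entries are not reproduced), and it assumes each cell of 'Features' holds a list of "key: value" strings. The conversion to a nullable integer column at the end is an optional extra, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's 'Features' column.
csvname = pd.DataFrame({
    "Features": [
        ["Shops: Close by", "Passing trade: Yes"],
        ["Lift: Yes", "No of Bedrooms: 1"],
        ["Lift: Yes", "No of Bedrooms: 2"],
        ["No of Bedrooms: 4", "Bedroom 1 Dims: 4.80 x 5.0"],
        ["Finish: Excellent", "Airconditioning: Yes"],
    ]
})

# astype(str) turns each list into its text representation so the regex can search it.
# expand=False makes str.extract return a Series for the single capture group.
bedrooms = (
    csvname["Features"]
    .astype(str)
    .str.extract(r"No of Bedrooms: (\d+)", expand=False)
)

# The extracted values are strings; convert them to nullable integers so that
# rows without a match stay missing instead of forcing the column to float.
csvname["No of Bedrooms"] = pd.to_numeric(bedrooms).astype("Int64")

print(csvname)
```

If the column was read from a CSV file, the cells are probably already strings, in which case astype(str) is harmless and the rest works unchanged.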
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |