How to create a dictionary out of my specific dataframe?
I have a dataframe df with a single column, names:
names
phil/andy
allen
john/william/chris
john
I want to turn it into a kind of "dictionary" (a pandas dataframe) with a unique random number for each name:
name value
phil 1
andy 2
allen 3
john 4
william 5
chris 6
How can I do that? This dataframe is just a sample, so I need a function that does the same thing for a very large dataframe.
Solution 1:[1]
Here you go.
import numpy as np
import pandas as pd
# Original pd.DataFrame
d = {'phil': [1],
'phil/andy': [2],
'allen': [3],
'john/william/chris': [4],
'john': [5]
}
df = pd.DataFrame(data=d)
# Append all names to a list
names = []
for col in df.columns:
    names = names + col.split("/")
# Remove duplicated names from the list, keeping first-seen order
names = list(dict.fromkeys(names))
# Create DF
df = pd.DataFrame(
    # Random numbers
    np.random.choice(
        len(names),         # Draw from 0 .. len(names) - 1
        size=len(names),    # One value per name
        replace=False       # Unique random numbers
    ),
    # Index names
    index=names,
    # Column names
    columns=['Rand value']
)
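If you want a dictionary instead of a pd.DataFrame, you can apply d = df.T.to_dict() at the end. This returns a nested dictionary of the form {name: {'Rand value': number}}; df['Rand value'].to_dict() gives a flat {name: number} mapping instead. If you want the numbers 0, 1, 2, ..., n-1 instead of random numbers, you can replace the np.random.choice(...) call with range(len(names)).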
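The question stores the names as values of a names column rather than as column labels, and asks for a reusable function. The same split, de-duplicate, and number steps can be sketched for that layout as follows. This is only a minimal sketch: the function name build_name_table and its parameters column, sep, and seed are hypothetical and do not come from the answers above, and it assumes every entry in the column is a string.

```python
import numpy as np
import pandas as pd


def build_name_table(df, column="names", sep="/", seed=None):
    """Return one row per unique name with a unique random value.

    The function name and its parameters are illustrative assumptions,
    not part of the original answers.
    """
    # Split "a/b/c" into lists, then put each name on its own row
    names = df[column].str.split(sep).explode()
    # Drop repeated names while keeping first-seen order
    unique_names = names.drop_duplicates().reset_index(drop=True)
    # A random permutation of 1..n gives every name a distinct value
    rng = np.random.default_rng(seed)
    values = rng.permutation(len(unique_names)) + 1
    return pd.DataFrame({"name": unique_names, "value": values})


df = pd.DataFrame({"names": ["phil/andy", "allen", "john/william/chris", "john"]})
table = build_name_table(df, seed=0)
print(table)

# Flat {name: value} mapping, if a plain dict is preferred
mapping = table.set_index("name")["value"].to_dict()
```

Because the splitting and de-duplication are done with vectorized pandas calls, this should scale better to a large dataframe than building a Python list in a loop, although that has not been measured here.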
Solution 2:[2]
You can use NumPy to generate random integers for the names, and then convert the result to a dictionary using .to_dict():
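import numpy as np
import pandas as pd
names_lst = ["phil", "andy", "allen", "john", "william", "chris", "john"]
# Drop the repeated "john" so each name appears once
df = pd.DataFrame(names_lst, columns=["name"]).drop_duplicates(ignore_index=True)
# A random permutation of 1..n gives each name a unique value
# (np.random.randint samples with replacement and can repeat values)
df["value"] = np.random.permutation(len(df)) + 1
print(df.set_index('name')["value"].to_dict())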
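One point worth keeping in mind when the values must be unique: integer samplers such as np.random.randint(low, high, size) draw with replacement from the half-open range [low, high), so repeats are possible, and they are unavoidable when there are more draws than distinct values. The sketch below contrasts that with a permutation, which never repeats. It uses NumPy's Generator API (np.random.default_rng) rather than the legacy functions, and the seed is an arbitrary choice for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only for this sketch

# With replacement: 7 draws from the 5 values 1..5 must contain repeats
with_replacement = rng.integers(1, 6, size=7)

# Without replacement: a shuffled 1..7, so every value is distinct
without_replacement = rng.permutation(7) + 1

print(len(set(with_replacement.tolist())), "distinct values out of", len(with_replacement))
print(len(set(without_replacement.tolist())), "distinct values out of", len(without_replacement))
```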
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |