How to create a dictionary out of my specific dataframe?

I have a dataframe df with a column called names:

names
phil/andy    
allen
john/william/chris
john

I want to turn it into a sort of "dictionary" (a pandas dataframe) with a unique random number for each name:

name     value
phil      1
andy      2
allen     3
john      4
william   5
chris     6

How can I do that? The dataframe above is just a sample, so I need a function that does the same thing for a very large dataframe.



Solution 1:[1]

Here you go.

import numpy as np
import pandas as pd

# Original pd.DataFrame, with the names stored in a column as in the question

df = pd.DataFrame({'names': ['phil/andy', 'allen', 'john/william/chris', 'john']})

# Append all names to a list

names = []

for cell in df['names']:
    # Each cell may hold several names separated by "/"
    names.extend(cell.split("/"))

# Remove duplicated names from the list

# dict.fromkeys keeps the first occurrence of each name and preserves order
names = list(dict.fromkeys(names))

# Create DF

df = pd.DataFrame(
    # Random numbers
    np.random.choice(
        len(names), # Sample from the integers 0 .. len(names) - 1
        size = len(names), # One value per name
        replace = False # Sampling without replacement keeps the numbers unique
        ),
    # Index names
    index = names,
    # Column names
    columns = ['Rand value']
    )


If you want a dictionary instead of a pd.DataFrame, you can apply d = df['Rand value'].to_dict() at the end. If you want the numbers 0, 1, 2, ..., n-1 instead of random numbers, you can replace the np.random.choice() call with range(len(names)).
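
For a very large dataframe, the same split, de-duplicate and number steps can also be written with pandas' own string methods instead of a Python loop. The following is only a sketch under a few assumptions: the function name build_name_table and the seed parameter are made up for illustration, the names are assumed to live in a column called names as in the question, and the unique values are taken to be a random permutation of 1..n.

```python
import numpy as np
import pandas as pd


def build_name_table(df, column="names", seed=None):
    """Return one row per unique name with a unique random integer value."""
    unique_names = (
        df[column]
        .dropna()            # ignore missing cells
        .str.split("/")      # "john/william/chris" -> ["john", "william", "chris"]
        .explode()           # one name per row
        .str.strip()         # remove stray whitespace around a name
        .drop_duplicates()   # keep the first occurrence, preserving order
        .reset_index(drop=True)
    )
    rng = np.random.default_rng(seed)
    # A random permutation of 1..n gives every name a distinct value.
    values = rng.permutation(len(unique_names)) + 1
    return pd.DataFrame({"name": unique_names, "value": values})


df = pd.DataFrame({"names": ["phil/andy    ", "allen", "john/william/chris", "john"]})
table = build_name_table(df, seed=0)
print(table)
```

Calling table.set_index("name")["value"].to_dict() on the result gives a plain dictionary if that is preferred over a dataframe.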

Solution 2:[2]

You can use numpy to generate random integers for those names, and then convert the result to a dictionary using .to_dict():

import numpy as np
import pandas as pd

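names_lst = ["phil", "andy", "allen", "john", "william", "chris", "john"]
# Drop repeated names so that each one appears only once
df = pd.DataFrame(names_lst, columns=["name"]).drop_duplicates().reset_index(drop=True)

# A random permutation of 1..n gives each name a unique value
df["value"] = np.random.permutation(df.shape[0]) + 1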

print(df.set_index('name')["value"].to_dict())
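
Since the question asks for a unique number per name, it may help to build the dictionary directly and check uniqueness explicitly. This is a rough sketch, not part of the original answer: the helper names_to_dict, its seed parameter, and the choice of drawing from a pool ten times larger than the number of names are all assumptions made for illustration.

```python
import numpy as np


def names_to_dict(names, seed=None):
    """Map each distinct name to a unique random integer."""
    # dict.fromkeys removes repeats while keeping first-seen order
    unique = list(dict.fromkeys(names))
    rng = np.random.default_rng(seed)
    # Sample without replacement from 1 .. 10*n so no value is repeated
    pool = np.arange(1, 10 * len(unique) + 1)
    values = rng.choice(pool, size=len(unique), replace=False)
    return dict(zip(unique, values.tolist()))


mapping = names_to_dict(
    ["phil", "andy", "allen", "john", "william", "chris", "john"], seed=42
)
# Every name has its own value
assert len(set(mapping.values())) == len(mapping)
print(mapping)
```

Passing a fixed seed makes the mapping reproducible between runs; leaving it as None gives different numbers each time.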

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2