'deleting sql code with python and selecting from JOIN

I have just started with Python and am already aware of the basics of SQL. In this code I'm trying to join tables and just test out python however I have two problems that I am not able to solve. Problem 1) Every time I run my code the data is entered a new. I understand why this is happening however to avoid this I need code that deletes duplicate data. Can somebody write a possible solution? Problem 2) When I use the the JOIN query and try to select first_name and job I recieve an error message telling me first_name is an unknown column??? This is my code:

import sqlite3
conn = sqlite3.connect('learning_sql')
c = conn.cursor()

def create_table1():
    c.execute('CREATE TABLE IF NOT EXISTS family (first_name TEXT, last_name TEXT, date_of_birth TEXT, age INTEGER)')

def create_table2():
    c.execute("CREATE TABLE IF NOT EXISTS jobs (first_name TEXT, last_name TEXT, job TEXT)")
    
def data_entry1():
    c.execute("INSERT INTO family VALUES ('Sven','Niles','31-03-2001', 16)")
    c.execute("INSERT INTO family VALUES ('Max','Niles','10-12-2005', 12)")
    c.execute("INSERT INTO family VALUES ('David','Niles','18-01-1971', 46)")
    c.execute("INSERT INTO family VALUES ('Katja', 'Niles', '16-08-1978', 39)")

def data_entry2():
    c.execute("INSERT INTO jobs VALUES ('Sven', 'Niles', 'NO')")
    c.execute("INSERT INTO jobs VALUES ('Max', 'Niles', 'NO')")
    c.execute("INSERT INTO jobs VALUES ('David', 'Niles', 'YES')")
    c.execute("INSERT INTO jobs VALUES ('Katja', 'Niles', 'YES')")

def read_from_db():
    c.execute("SELECT first_name, job FROM family JOIN jobs ON jobs.first_name = family.first_name WHERE job = 'YES'")
    data = c.fetchall()
    print (data)
    for row in data:
        print (row)

create_table1()
create_table2()
data_entry1()
data_entry2()
read_from_db()

this is the error message I am experiencing:

Traceback (most recent call last):
  File "/home/pi/Desktop/praciticing_sql.py", line 34, in <module>
    read_from_db()
  File "/home/pi/Desktop/praciticing_sql.py", line 24, in read_from_db
    c.execute("SELECT first_name, job FROM family JOIN jobs ON jobs.first_name = family.first_name WHERE job = 'YES'")
sqlite3.OperationalError: ambiguous column name: first_name


Solution 1:[1]

You have two tables in your query with the field "first name". Your query needs to specify which one you want to reference.

Replace your select line with

SELECT jobs.first_name, job FROM family....

Solution 2:[2]

You are already using CREATE TABLE IF NOT EXISTS, but it might be easier to wrap all four creating/insertion functions into a check for the database contents:

if not table_exists(conn, "family"):
    conn.execute("BEGIN")
    with conn:
        create_table1()
        create_table2()
        data_entry1()
        data_entry2()

read_from_db()

The column name first_name is ambiguous because it occurs in both tables. When you join using USING instead, the database will automatically remove the duplicate column:

c.execute("""SELECT first_name, job
             FROM family
             JOIN jobs USING (first_name)
             WHERE job = 'YES'""")

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Greynerd
Solution 2 CL.