'Not able to fetch <h3> ag from the below website using Beautiful Soup

I'm trying to fetch top 100 movie names, but not able to access h3 tag.How can I fetch it from this link? Edit - Using below code to extract h3 -

import requests
from bs4 import BeautifulSoup

request = requests.get("https://www.empireonline.com/movies/features/best-movies-2/")
movie_data = request.text
movie_soup = BeautifulSoup(movie_data, "html.parser")
movie = movie_soup.find("h3", class_="jsx-4245974604")
print(movie)


Solution 1:[1]

The articles are embedded within the page in Json format. You can use this example how to extract the movie names:

import json
import requests
from bs4 import BeautifulSoup


url = "https://www.empireonline.com/movies/features/best-movies-2/"

soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = json.loads(soup.select_one("#__NEXT_DATA__").contents[0])

# uncomment this to print all data:
# print(json.dumps(data, indent=4))


def find_articles(data):
    if isinstance(data, dict):
        for k, v in data.items():
            if k.startswith("ImageMeta:"):
                yield v["titleText"]
            else:
                yield from find_articles(v)
    elif isinstance(data, list):
        for i in data:
            yield from find_articles(i)


for a in find_articles(data):
    print(a)

Prints:

100) Stand By Me
99) Raging Bull
98) Amelie
97) Titanic
96) Good Will Hunting
95) Arrival
94) Lost In Translation
2) The Princess Bride
92) The Terminator
91) The Prestige
90) No Country For Old Men
89) Shaun Of The Dead
88) The Exorcist
87) Predator
86) Indiana Jones And The Last Crusade
85) Léon
84) Rocky
83) True Romance
82) Some Like It Hot
81) The Social Network
15) Spirited Away
79) Captain America: Civil War
78) Oldboy
77) Toy Story
76) A Clockwork Orange
75) Fargo
74) Mulholland Dr.
73) Seven Samurai
72) Rear Window
71) Hot Fuzz
70) The Lion King
69) Singin' In The Rain
68) Ghostbusters
67) Memento
66) Return Of The Jedi
65) Avengers Assemble
64) L.A. Confidential
63) Donnie Darko
62) La La Land
61) Forrest Gump
60) American Beauty
59) E.T. – The Extra Terrestrial
58) Inglourious Basterds
57) Whiplash
56) Reservoir Dogs
55) Pan's Labyrinth
54) Vertigo
53) Psycho
52) Once Upon A Time In The West
51) It's A Wonderful Life
50) Lawrence Of Arabia
Trainspotting
48) The Silence Of The Lambs
47) Interstellar
46) Citizen Kane
45) Drive
44) Gladiator
43) One Flew Over The Cuckoo's Nest
42) There Will Be Blood
41) Eternal Sunshine Of The Spotless Mind
40) 12 Angry Men
39) Saving Private Ryan
38) Mad Max: Fury Road
37) The Thing
36) The Departed
35) The Shining
34) Guardians Of The Galaxy
33) Schindler's List
32) The Usual Suspects
31) Taxi Driver
30) Seven
29) The Big Lebowski
28) Casablanca
27) The Good, The Bad And The Ugly
26) Heat
25) Terminator 2: Judgment Day
24) The Matrix
23) The Lord Of The Rings: The Two Towers
22) Apocalypse Now
21) 2001: A Space Odyssey
20) Die Hard
19) Jurassic Park
18) Inception
17) Fight Club
16) The Lord Of The Rings: The Return Of The King
15) Aliens
14) Alien
13) Blade Runner
12: The Godfather Part II
11) Back To The Future
10) The Lord Of The Rings: The Fellowship Of The Ring
9) Star Wars
8) Jaws
7) Raiders Of The Lost Ark
6) Goodfellas
5) Pulp Fiction
4) The Shawshank Redemption
3) The Dark Knight
2) The Empire Strikes Back
The Godfather

Solution 2:[2]

import json
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.empireonline.com/movies/features/best-movies-2/")
movies_webpage = response.text
soup = BeautifulSoup(movies_webpage, "html.parser")
data = json.loads(soup.select_one(selector="#__NEXT_DATA__").contents[0])
data_dict = data["props"]["pageProps"]["apolloState"]
for a, b in data_dict.items():
    if a.startswith("ImageMeta"):
        print(b["titleText"])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Andrej Kesely
Solution 2 Mbuthi Mungai