Python: Extract emails from a string of filenames
I want to extract the email address from filenames. Here is a set of example filenames:
string1 = "[email protected]_2022-05-11T11_59_58+00_00.pdf"
string2 = "[email protected]_test.pdf"
string3 = "[email protected]"
I would like to split each filename into two parts: the first containing the email and the second containing the rest. For string2, it should give:
['[email protected]', '_test.pdf']
I tried this regex, but it does not work for the second and third strings:
email = re.search(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", string)
Thank you for your help
Solution 1:[1]
Given the samples you provided, you can do something like this:
import re

strings = ["[email protected]_2022-05-11T11_59_58+00_00.pdf",
           "[email protected]_test.pdf",
           "[email protected]"]

# Group 1 captures the email, group 2 captures the rest of the filename.
pattern = r'([^@]+@[\.A-Za-z]+)(.*)'
[re.findall(pattern, string)[0] for string in strings]
Output:
[('[email protected]', '_2022-05-11T11_59_58+00_00.pdf'),
('[email protected]', '_test.pdf'),
('[email protected]', '-fdsdfsd-saf.pdf')]
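If a two-element list per filename is preferred, as in the question, the same pattern can be wrapped in a small helper. This is only a sketch: the `split_filename` name and the `john.doe@example.com` addresses are hypothetical placeholders, not taken from the question.

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r"([^@]+@[.A-Za-z]+)(.*)")


def split_filename(filename):
    """Return [email, rest], or None when the filename does not fit the pattern."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        return None
    return list(match.groups())


# Hypothetical filenames that mimic the shapes shown in the question.
filenames = [
    "john.doe@example.com_2022-05-11T11_59_58+00_00.pdf",
    "john.doe@example.com_test.pdf",
    "john.doe@example.com-fdsdfsd-saf.pdf",
]

for name in filenames:
    print(split_filename(name))
```

Returning `None` for a filename without an `@` avoids the `IndexError` that `re.findall(...)[0]` would raise on an empty result.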
Mail pattern explanation: ([^@]+@[\.A-Za-z]+)
[^@]+ : one or more characters other than @
@ : a literal @ ("at") sign
[\.A-Za-z]+ : one or more letters and dots
Rest pattern explanation: (.*)
(.*) : any combination of characters (possibly none)
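The restricted domain class is what makes the difference. The pattern in the question allows `_` and `-` after the `@`, so it keeps matching into the rest of the filename and ends at the final `.pdf`. The first sample only works because the uppercase `T` in the timestamp is outside its lowercase-only class. The check below is a minimal sketch using hypothetical `john.doe@example.com` addresses, not the ones from the question.

```python
import re

# Pattern copied from the question.
QUESTION_PATTERN = re.compile(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+")


def question_match(filename):
    """Return the text matched by the question's pattern, or None."""
    match = QUESTION_PATTERN.search(filename)
    return match.group(0) if match else None


# Hypothetical filenames that mimic the shapes shown in the question.
for name in [
    "john.doe@example.com_2022-05-11T11_59_58+00_00.pdf",  # stops before "_2022..."
    "john.doe@example.com_test.pdf",                        # swallows "_test.pdf"
    "john.doe@example.com-fdsdfsd-saf.pdf",                 # swallows "-fdsdfsd-saf.pdf"
]:
    print(question_match(name))
```

The same reasoning shows the limits of `[\.A-Za-z]+`: a domain containing digits or hyphens would be cut short, and a suffix that starts with a letter or a dot would be absorbed into the email.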
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | lemon |