Python: Extract emails from a string of filenames
I want to extract the email address from filenames. Here are some example filenames:
string1 = "[email protected]_2022-05-11T11_59_58+00_00.pdf"
string2 = "[email protected]_test.pdf"
string3 = "[email protected]"
I would like to split each filename into two parts: the first contains the email and the second is the rest. For string2, the result should be:
['[email protected]', '_test.pdf']
I tried this regex, but it does not work for the second and third strings:
email = re.search(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", string)
Thank you for your help
Solution 1:[1]
Given the samples you provided, you can do something like this:
import re
strings = ["[email protected]_2022-05-11T11_59_58+00_00.pdf",
"[email protected]_test.pdf",
"[email protected]"]
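# Group 1: the email (everything up to "@", then letters and dots); group 2: the rest
pattern = r'([^@]+@[\.A-Za-z]+)(.*)'
# findall returns a list of (email, rest) tuples; take the first match per filename
[re.findall(pattern, string)[0] for string in strings]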
Output:
[('[email protected]', '_2022-05-11T11_59_58+00_00.pdf'),
('[email protected]', '_test.pdf'),
('[email protected]', '-fdsdfsd-saf.pdf')]
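The question asks for a two-element list per filename rather than a tuple. A minimal sketch of a helper that does this, assuming the same pattern as above; the name `split_email` and the sample filenames (which use `example.com` placeholder addresses, since the original addresses are not shown here) are hypothetical:

```python
import re

# Same pattern as above: group 1 is the email, group 2 is the rest.
PATTERN = re.compile(r'([^@]+@[\.A-Za-z]+)(.*)')


def split_email(filename):
    """Return [email, rest], or None if no email-like prefix is found."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    return [match.group(1), match.group(2)]


# Hypothetical filenames with placeholder addresses.
print(split_email("user@example.com_test.pdf"))
# ['user@example.com', '_test.pdf']
print(split_email("report.pdf"))
# None
```

Returning `None` for filenames without an `@` avoids the `IndexError` that `re.findall(...)[0]` would raise on such input.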
Mail pattern explanation ([^@]+@[\.A-Za-z]+):
[^@]+: any combination of characters except @
@: a literal at sign
[\.A-Za-z]+: any combination of letters and dots
Rest pattern explanation (.*):
(.*): any combination of characters
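To illustrate why the domain character class matters, the sketch below compares the regex from the question with the pattern above. The filenames are hypothetical and use placeholder addresses. The question's domain class `[a-z0-9\.\-+_]` also accepts `_`, `-` and digits, so it can run past the end of the address and stop only at the last `.` followed by lowercase letters:

```python
import re

# Regex from the question: the domain class includes "_", "-", "+" and digits.
question_re = re.compile(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+")
# Pattern from the answer: the domain is limited to letters and dots.
answer_re = re.compile(r'([^@]+@[\.A-Za-z]+)(.*)')

# Hypothetical filenames with placeholder addresses.
names = [
    "user@example.com_2022-05-11T11_59_58+00_00.pdf",
    "user@example.com_test.pdf",
    "user@example.com-fdsdfsd-saf.pdf",
]

for name in names:
    from_question = question_re.search(name).group(0)
    from_answer = answer_re.match(name).group(1)
    print(from_question, "|", from_answer)
```

In the first filename the uppercase `T` stops the question's regex, which then backtracks to the address. In the other two the suffix is entirely lowercase, so the whole filename is matched as the "email". The answer's pattern has its own limit: it relies on the rest starting with something other than a letter or a dot, so a name such as `user@example.com.pdf` would be captured whole as the email.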
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | lemon |
