Extract house numbers from address string

I am importing user data from a foreign database on demand. While I keep house numbers separate from the street names, the other database does not.

I use

preg_match_all('!\d+!')

to pull out the numbers. This works fine for an address line like this:

streetname 60

But it does not work for an address line like this:

streetname 60/2/3

In that case I end up extracting 60, and /2/3 stays in the name of the street.

Unfortunately I am not a regex expert, quite the contrary. My problem is that I need to detect not only digits, but also slashes and hyphens.

Can someone help me out here?



Solution 1:[1]

Try:

preg_match_all('![0-9/-]+!', 'streetname 60/2/3', $matches);
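
To also separate the street name from the number, the same character class can be used in an anchored pattern. This is only a minimal sketch, assuming the house number is the last thing on the line and starts with a digit. splitAddressLine is a hypothetical helper name, not part of the original answer.

```php
<?php
// Hypothetical helper: splits an address line into street name and house number.
// Assumption: the house number comes last and starts with a digit.
function splitAddressLine(string $line): array
{
    $line = trim($line);
    // Group 1: street name (lazy match).
    // Group 2: one digit followed by any mix of digits, slashes and hyphens.
    if (preg_match('!^(.*?)\s+(\d[0-9/-]*)$!', $line, $m)) {
        return ['street' => $m[1], 'number' => $m[2]];
    }
    // No trailing house number found: keep the whole line as the street name.
    return ['street' => $line, 'number' => null];
}

$result = splitAddressLine('streetname 60/2/3');
// $result['street'] is 'streetname', $result['number'] is '60/2/3'
```

Because the number has to start with a digit, a hyphen inside the street name itself is not mistaken for part of the house number.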

Solution 2:[2]

To give a definite answer, we would have to know the patterns in your data.

For example, in Germany we sometimes have house numbers like 13a or 23-42, which can also be written as 23 - 42.

One possible solution would be to match everything after a whitespace character that starts with a digit:

preg_match_all('!\s(\d.*)!', 'streetname 60/2/3', $matches);

This would produce false positives, though, if you have American data with street names like 13street.
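
If the house number always ends the line, a tighter pattern can cover the formats mentioned above (13a, 23-42, 23 - 42) without grabbing everything after the first digit. The following is a rough sketch under that assumption. extractHouseNumber is a hypothetical name, and real data may well need further cases.

```php
<?php
// Hypothetical helper: returns the trailing house number of an address line,
// or null if the line does not end in one.
// Assumption: the number is at the end of the line, preceded by whitespace.
function extractHouseNumber(string $line): ?string
{
    // \d+[a-z]?         digits with an optional letter suffix, e.g. 13a
    // (?:\s*[-/]\s*...) further parts joined by "-" or "/", spaces allowed
    // \s*$              nothing but optional whitespace may follow
    $pattern = '!\s(\d+[a-z]?(?:\s*[-/]\s*\d+[a-z]?)*)\s*$!i';
    if (preg_match($pattern, $line, $m)) {
        return $m[1];
    }
    return null;
}

$number = extractHouseNumber('Hauptstrasse 23 - 42');
// $number is '23 - 42'
```

Anchoring at the end of the line means a token like 13street in the middle of a name is not picked up, at the cost of missing addresses where the number comes first.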

Solution 3:[3]

This approach does not use regex. It splits the address on spaces and returns the first part that starts with a digit. It suits addresses such as 12 Street Road or Street Name 1234B.

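function getStreetNumberFromStreetAddress($streetAddress){
  // Split on single spaces; consecutive spaces produce empty parts.
  $array = explode(' ', $streetAddress);

  foreach($array as $a){
    // Skip empty parts, then return the first part that starts with a digit.
    if ($a !== '' && is_numeric($a[0])){
      return $a;
    }
  }
  // No part starts with a digit.
  return null;
}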

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 HamZa
Solution 2 cypherabe
Solution 3 Pearce