Is there a way or package to clean French postal code data in Stata?

I have a task at work in Stata in which I need to clean zip code data. For example, 71000 is Paris, while 71001 is only a part of Paris. In my data there are firms with the same id and address, but one record has the exact zip code (71001) and the other has only the city-level code (71000). For my task, having 71000 is fully sufficient. Is there any package that makes this easier?



Solution 1:[1]

I think something like this might work:

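* Toy data: one city-level code and one more precise code
clear
input str5 zip
71000
71001
end

* Keep the two-digit département prefix and set the last three digits to 000
gen zip2 = substr(zip, 1, 2) + "000"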

Here the first two digits correspond to the département, so we leave them alone. The last three digits identify a more precise location, so replacing them with 000 maps the code to the préfecture.
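Applied to the firm data described in the question, the normalized code can then be used to drop the duplicate records. This is only a sketch: the variable names id and address and the sample rows are made up for illustration, and it assumes the zip code is stored as a string.

```stata
* Hypothetical firm data: same id and address, zip codes at different precision
clear
input int id str20 address str5 zip
1 "1 rue Exemple" "71000"
1 "1 rue Exemple" "71001"
2 "5 rue Exemple" "71001"
end

* Normalize to the city-level code
gen str5 zip2 = substr(zip, 1, 2) + "000"

* Keep one record per firm, address, and normalized code
duplicates drop id address zip2, force
list
```

The force option is required because a variable list is given; duplicates drop keeps the first observation of each group, so it may be worth sorting the data first if one of the records should be preferred.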

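There are some exceptions and caveats: overseas départements, for instance, use three-digit prefixes such as 971, and this rule also collapses every other town in a département onto the XX000 code. Hopefully, your data is limited to the more straightforward case.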
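If the data turns out to be less tidy, something along these lines might help. It is a sketch under two assumptions that may not hold for your data: that the codes were imported as numbers (so leading zeros were lost), and that codes starting with 97 should keep a three-digit prefix. The variable name zipnum is hypothetical.

```stata
* Hypothetical data: zip codes stored as numbers, so 01000 became 1000
clear
input long zipnum
71001
1000
97110
end

* Rebuild a five-character string with leading zeros
gen str5 zip = string(zipnum, "%05.0f")

* Default rule: two-digit département prefix followed by 000
gen str5 zip2 = substr(zip, 1, 2) + "000"

* Assumption: codes starting with 97 keep a three-digit prefix
replace zip2 = substr(zip, 1, 3) + "00" if substr(zip, 1, 2) == "97"
list
```

Whether the three-digit rule is the right level of aggregation for overseas codes depends on the task, so check the result against a few known addresses before relying on it.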

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 dimitriy