
I have a dataframe where I would like to add a full address column, which would be the combination of 4 other columns (street, city, county, postalcode) from that dataframe. Example output of the address column would be:

5 Test Street, Worthing, West Sussex, RH5 3BX

Or if the city was empty as an example:

5 Test Street, West Sussex, RH5 3BX

This is my code. After testing, I see I might need to use something like apply, but I can't work out how to do it.

def create_address(street: str, city: str, county: str, postalcode: str) -> str:
    
    list_address = []
    
    if street:
        list_address.append(street)
    if city:
        list_address.append(city)
    if county:
        list_address.append(county)
    if postalcode:
        list_address.append(postalcode)

    address = ", ".join(list_address).rstrip(", ")

    return address

df["address"] = create_address(df["Street"], df["City"], df["County"], df["PostalCode"])

2 Answers

You can use apply with a lambda to build the concatenated full address.

Example input

EDIT: the example now includes an empty postalcode.

import pandas as pd

data = {
    'street': ['street1', 'street2', 'street3'],
    'city': ['city1', '', 'city2'],
    'county': ['county1', 'county2', 'county3'],
    'postalcode': ['postalcode1', 'postalcode2', '']
}
df = pd.DataFrame(data)

Sample code

df['full_address'] = df.apply(
    lambda row: ', '.join(filter(None, [row['street'], row['city'], row['county'], row['postalcode']])),
    axis=1
)

Passing None as the first argument to filter removes falsy elements, so empty strings (and None values) are dropped before the join.

Output

    street   city   county   postalcode                          full_address
0  street1  city1  county1  postalcode1  street1, city1, county1, postalcode1
1  street2         county2  postalcode2         street2, county2, postalcode2
2  street3  city2  county3                            street3, city2, county3
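
One caveat with the filter(None, ...) approach: NaN is truthy in Python, so it is not filtered out, and joining a float raises a TypeError. If the real data may hold NaN or None rather than empty strings, a variant along these lines could be used. This is only a sketch: the helper name join_parts and the sample values (including the None and NaN entries) are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical sample data; the None and NaN entries are assumptions added
# to show missing values that are not plain empty strings.
df = pd.DataFrame({
    "street": ["street1", "street2", "street3"],
    "city": ["city1", None, "city2"],
    "county": ["county1", "county2", "county3"],
    "postalcode": ["postalcode1", "postalcode2", float("nan")],
})


def join_parts(row: pd.Series) -> str:
    """Join the non-missing, non-blank values of a row with ', '."""
    # pd.notna rejects None and NaN; the strip check rejects empty or
    # whitespace-only strings.
    parts = [str(p) for p in row if pd.notna(p) and str(p).strip()]
    return ", ".join(parts)


cols = ["street", "city", "county", "postalcode"]
df["full_address"] = df[cols].apply(join_parts, axis=1)
print(df["full_address"].tolist())
```

Selecting the four columns before calling apply keeps any other columns in the dataframe out of the joined address.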
  • That's amazing thank you. And if the postalcode was empty for example then the full_address entry would be street, city, county with no comma after county?
    – CGarden
    Commented Dec 31, 2024 at 18:06
  • nice method ;-)
    – rehaqds
    Commented Dec 31, 2024 at 18:06
  • Thank you both :) Yes @CGarden, see the edit I made to the answer where postalcode is empty, and let us know how you find it.
    – samhita
    Commented Dec 31, 2024 at 18:13
  • Thank you again will give it a go. Think you have a slight typo before the words sample code.
    – CGarden
    Commented Dec 31, 2024 at 19:28
  • Tested. That's working great thank you.
    – CGarden
    Commented Jan 2 at 12:34

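Pandas has a dedicated vectorized string concatenation method, Series.str.cat, shown below. The str.replace call removes the stray separator left by an empty middle field, and str.strip trims a leading or trailing separator left by an empty first or last field. Prefer vectorized Pandas methods like these when they are available.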

df['full_address'] = df['street'].str.cat(df[['city', 'county', 'postalcode']], sep = ', ').str.replace(' ,', '').str.strip(', ')

which gives:

    street   city   county   postalcode                          full_address
0  street1  city1  county1  postalcode1  street1, city1, county1, postalcode1
1  street2         county2  postalcode2         street2, county2, postalcode2
2  street3  city2  county3  postalcode3  street3, city2, county3, postalcode3
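
To see how the replace and strip steps behave, the one-liner can be split up and run on rows where the first, a middle, or the last field is empty. This is a sketch with hypothetical sample data (the fourth row and the empty postalcode are assumptions); regex=False is passed explicitly so the pattern is treated as a literal string regardless of the Pandas version.

```python
import pandas as pd

# Hypothetical data: row 1 has an empty middle field, row 2 an empty last
# field, and row 3 an empty first field.
df = pd.DataFrame({
    "street": ["street1", "street2", "street3", ""],
    "city": ["city1", "", "city2", "city3"],
    "county": ["county1", "county2", "county3", "county4"],
    "postalcode": ["postalcode1", "postalcode2", "", "postalcode4"],
})

# Plain concatenation; row 1 becomes "street2, , county2, postalcode2".
joined = df["street"].str.cat(df[["city", "county", "postalcode"]], sep=", ")

df["full_address"] = (
    joined
    .str.replace(" ,", "", regex=False)  # drop the " ," left by an empty middle field
    .str.strip(", ")  # trim commas/spaces left by an empty first or last field
)
print(df["full_address"].tolist())
```

Two things to keep in mind: str.strip(', ') removes any leading or trailing commas and spaces, including ones that belong to the data, and str.cat returns a missing value for any row containing NaN unless na_rep is given (for example na_rep='').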
  • Thank you. Looks great. I'll give it a go.
    – CGarden
    Commented Dec 31, 2024 at 19:31
  • When I tested this it was not replacing the commas where needed. This may be my error in applying it. Accepted @samhita answer. Thank you again.
    – CGarden
    Commented Jan 2 at 12:36
  • It would seem you have some entries with an empty postalcode. Answer code now edited to deal with this. As a general rule it is preferable to use efficient vectorized functions rather than apply, which calls a function within a Python loop, although this does not matter for small data sets.
    Commented Jan 2 at 22:52
  • Ok thank you. That's great. I'll try it.
    – CGarden
    Commented Jan 3 at 23:06
