Exercises with strings#

First functions#

length#

✪ a. Write a function length1(s) which, given a string, RETURNs the length of the string. Use the len function. For example, with the string "ciao" your function should return 4, while with "hi" it should return 2

# version with len, faster because Python always stores the length of a string
# in memory, so it is immediately available
def length1(s):
    return len(s)
length1("ciao")
4

✪ b. Write a function length2 that, like before, calculates the string length, this time without using len (instead, use a for loop)

# version with counter, slower
def length2(s):
    counter = 0
    for character in s:
        counter = counter + 1
    return counter
length2("mondo")
5
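
The comments in the two solutions say that the len version is faster than the counter version. A rough way to check this claim is sketched below, assuming the standard timeit module; the test string and the number of repetitions are arbitrary choices made only for illustration, and the measured times depend on the machine.

```python
import timeit

def length1(s):
    return len(s)

def length2(s):
    counter = 0
    for character in s:
        counter = counter + 1
    return counter

text = "a" * 10000   # arbitrary test string, chosen only for illustration

# both versions must agree on the result
assert length1(text) == length2(text) == 10000

# time 100 calls of each version: absolute numbers vary between machines
t_len = timeit.timeit(lambda: length1(text), number=100)
t_loop = timeit.timeit(lambda: length2(text), number=100)
print("len:", t_len, "loop:", t_loop)
```

On a typical machine the loop version should take noticeably longer, because it visits every character while len only reads a stored value.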

contains#

✪ Write the function contains(word, character), which RETURNs True if the string contains the given character, otherwise RETURNs False

  • Use in operator

def contains(word, character):
    return character in word
contains('ciao', 'a')
True
contains('ciao', 'z')
False

invertlet#

✪ Write the function invertlet(first, second) which takes as input two strings of length greater than 3 and RETURNs a new string in which the words are concatenated and separated by a space, with the last characters of the two words swapped. For example, if you pass 'twist' and 'space', the function should RETURN 'twise spact'

  • If the two strings are not of adequate length, the program PRINTS error!

NOTE 1: PRINTing is different from RETURNing !!! Whatever gets printed is shown to the user but Python cannot reuse it for calculations.

NOTE 2: if a function does not explicitly return anything, Python implicitly returns None.

NOTE 3: Resorting to prints on error conditions is actually bad practice: this is an invitation to think about what happens when you print something and do not return anything. You can read a discussion about it in Error handling and testing solutions
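
The difference described in NOTE 1 and NOTE 2 can be seen in a minimal sketch like the following. The two functions greet_print and greet_return are hypothetical names introduced only for this illustration; they are not part of the exercises.

```python
def greet_print(name):
    # only shows the text to the user: there is no explicit return,
    # so Python implicitly returns None
    print("hello " + name)

def greet_return(name):
    # hands the string back to the caller, who can reuse it
    return "hello " + name

printed = greet_print("Ada")     # prints: hello Ada
returned = greet_return("Ada")   # prints nothing

assert printed is None
assert returned == "hello Ada"
assert returned.upper() == "HELLO ADA"   # a returned value can be reused
```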

def invertlet(first, second):
    if len(first) <= 3 or len(second) <= 3:
        print("error!")
    else:
        return first[:-1] + second[-1] + " " + second[:-1] + first[-1]
print(invertlet("twist", "space"))
print(invertlet("fear", "me"))
print(invertlet("so", "bad"))
twise spact
error!
None
error!
None

nspace#

✪ Write a function nspace(word, index) that, given a string and an index n, RETURNs a new string in which the character at index n is replaced by a space.

  • if the index is too big (there is no character at that position), raise the exception ValueError - in the exception message state clearly what the problem was and the input.

NOTE: This time instead of printing the error we raise the exception, which will prevent the program from continuing further. This is a much better way to react to erroneous conditions.
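
To see how a raised exception interrupts the normal flow, here is a minimal sketch. The helper first_char is hypothetical and used only for this illustration; the try / except block shows that the line after the failing call is never executed.

```python
def first_char(s):
    # hypothetical helper, used only for this illustration
    if s == "":
        raise ValueError("Expected a non-empty string, got: %r" % s)
    return s[0]

steps = []
try:
    steps.append("before")
    first_char("")            # raises ValueError
    steps.append("after")     # never executed
except ValueError as err:
    steps.append("handled: " + str(err))

print(steps)
```

Without the try / except, the exception would propagate up and stop the program, which is exactly what we want when the input makes no sense.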

def nspace(word, index):
    if index >= len(word):
        raise ValueError("index %s is larger than word %s" % (index, word))
    return word[:index] + ' ' + word[index+1:]
nspace("allegory", 5)
'alleg ry'
nspace('toy', 9)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 nspace('toy', 9)

Cell In[10], line 3, in nspace(word, index)
      1 def nspace(word, index):
      2     if index >= len(word):
----> 3         raise ValueError("index %s is larger than word %s" % (index, word))
      4     return word[:index] + ' ' + word[index+1:]

ValueError: index 9 is larger than word toy

startend#

✪ Write a function startend(s) which takes a string s and RETURNs the first two and the last two characters

  • if length is less than 4, raises ValueError - in the exception message state clearly what the problem was and the input

def startend(s):
    if len(s) < 4:
        raise ValueError("I need at least 4 characters, got instead: %s" % s)
    return s[:2] + s[-2:]     
startend('robust pack')
'rock'
startend('sig')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 startend('sig')

Cell In[13], line 3, in startend(s)
      1 def startend(s):
      2     if len(s) < 4:
----> 3         raise ValueError("I need at least 4 characters, got instead: %s" % s)
      4     return s[:2] + s[-2:]

ValueError: I need at least 4 characters, got instead: sig

swap#

Write a function that given a string, swaps the first and last character and RETURN the result.

  • if the string is empty, raise ValueError - in the exception message state clearly the cause of the problem

def swap(s):
    if s == '':
        raise ValueError("Empty string!")
    if len(s) == 1:  # a single character is both first and last
        return s
    return s[-1] + s[1:-1] + s[0]
print(swap('dream'))
print(swap('c'))
mread
c
swap('')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[18], line 1
----> 1 swap('')

Cell In[16], line 3, in swap(s)
      1 def swap(s):
      2     if s == '':
----> 3         raise ValueError("Empty string!")
      4     if len(s) == 1:  # a single character is both first and last
      5         return s

ValueError: Empty string!

Verify comprehension#

has_char#

✪ RETURN True if word contains char, False otherwise

  • USE a while loop

  • DON’T use in operator nor methods such as .count (too easy!)

def has_char(word, char):
    index = 0  # initialize index
    while index < len(word):
        if word[index] == char:
            return True  # we found the character, we can stop search
        index += 1  # it is like writing index = index + 1
    # if we arrive AFTER the while, there is only one reason:
    # we found nothing, so we have to return False
    return False
assert has_char("ciao", 'a')
assert not has_char("ciao", 'A')
assert has_char("ciao", 'c') 
assert not has_char("", 'a')
assert not has_char("ciao", 'z')

count#

✪ RETURN the number of occurrences of char in word

  • USE a for in loop

  • DON’T use count method (too easy!)

  • DON’T print, it must return the value !

def count(word, char):
    occurrences = 0
    for c in word:
        if c == char:
            occurrences += 1
    return occurrences
assert count("ciao", "z") == 0 
assert count("ciao", "c") == 1
assert count("babbo", "b") == 3
assert count("", "b") == 0
assert count("ciao", "C") == 0 

has_lower#

✪ RETURN True if the word contains at least one lowercase character, otherwise RETURN False

  • USE a while loop

def has_lower(s):
    i = 0
    while i < len(s):
        if s[i].islower():  # True only for lowercase letters, not for digits or punctuation
            return True
        i += 1
    return False
assert has_lower("daviD")
assert not has_lower("DAVID")
assert not has_lower("")
assert has_lower("a")
assert not has_lower("A")

dialect#

✪✪ There exists a dialect in which every "a" must always be preceded by a "g". If a word contains an "a" not preceded by a "g", we can say with certainty that the word does not belong to the dialect. Write a function that, given a word, RETURNs True if the word respects the rules of the dialect, False otherwise.

def dialect(word):
    for i in range(0,len(word)):
        if word[i] == "a":
            if i == 0 or word[i - 1] != "g":
                return False
    return True
assert dialect("a") == False
assert dialect("ab") == False
assert dialect("ag") == False
assert dialect("ga") == True
assert dialect("gga") == True
assert dialect("gag") == True
assert dialect("gaa") == False
assert dialect("gaga") == True
assert dialect("gabga") == True
assert dialect("gabgac") == True
assert dialect("gabbgac") == True
assert dialect("gabbgagag") == True

countvoc#

✪✪ Given a string, write a function that counts the number of vowels. If the number of vowels is even, RETURN it, otherwise raise the exception ValueError

def countvoc(word):
    n_vowels = 0

    vowels = ["a","e","i","o","u"]

    for char in word:
        if char.lower() in vowels:
            n_vowels = n_vowels + 1

    if n_vowels % 2 == 0:
        return n_vowels
    else:
        raise ValueError("Odd number of vowels!")
assert countvoc("arco") == 2
assert countvoc("scaturire") == 4

try:
    countvoc("ciao")    # with this string we expect it to raise the exception ValueError
    raise Exception("I shouldn't get here!")
except ValueError:      # if it raises the exception ValueError, it is behaving as expected and we do nothing
    pass

try:
    countvoc("aiuola")  # with this string we expect it to raise the exception ValueError
    raise Exception("I shouldn't get here!")
except ValueError:      # if it raises the exception ValueError, it is behaving as expected and we do nothing
    pass

extract_email#

def extract_email(s):
    """ Takes a string s formatted like 
    
        "lun 5 nov 2018, 02:09 John Doe <john.doe@some-website.com>"
        
        and RETURN the email "john.doe@some-website.com"
        
        NOTE: the string MAY contain spaces before and after, but your function must be able to extract email anyway.
        
        If the string for some reason is found to be ill formatted, raises ValueError
    """
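    stripped = s.strip()
    i = stripped.find('<')
    # the docstring requires a ValueError on ill-formatted input:
    # here we check that '<' is present and that the string ends with '>'
    if i == -1 or not stripped.endswith('>'):
        raise ValueError("Ill-formatted string: %s" % s)
    return stripped[i+1:-1]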
assert extract_email("lun 5 nov 2018, 02:09 John Doe <john.doe@some-website.com>") == "john.doe@some-website.com"
assert extract_email("lun 5 nov 2018, 02:09 Foo Baz <mrfoo.baz@blabla.com>") == "mrfoo.baz@blabla.com"
assert extract_email(" lun 5 nov 2018, 02:09 Foo Baz <mrfoo.baz@blabla.com>  ") == "mrfoo.baz@blabla.com"  # with spaces

canon_phone#

✪ Implement a function phone_canon(phone) that canonicalizes a phone number given as a string. It must RETURN the canonical version of phone as a string.

For us, a canonical phone number:

  • contains no spaces

  • contains no international prefix, so no +39 nor 0039: we assume all calls were placed from Italy (even if they have an international prefix)

For example, all of these are canonicalized to "0461123456":

+39 0461 123456
+390461123456
0039 0461 123456
00390461123456

These are canonicalized as the following:

328 123 4567        ->  3281234567
0039 328 123 4567   ->  3281234567
0039 3771 1234567   ->  37711234567

REMEMBER: strings are immutable !!!!!
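
As a reminder of what immutability means in practice, here is a minimal sketch with an arbitrary example string: string methods never change the original, they return a new string, and assigning to a single character is not allowed.

```python
text = "a b c"

# methods such as replace never modify the original: they return a NEW string
compact = text.replace(" ", "")

assert text == "a b c"     # the original is unchanged
assert compact == "abc"

# assigning to a single character raises TypeError
try:
    text[0] = "z"
    modified = True
except TypeError:
    modified = False

assert not modified
```

So whenever you transform a string, remember to store the result in a variable, otherwise the work is lost.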

def phone_canon(phone):
    p = phone.replace(' ', '')
    if p.startswith('0039'):
        p = p[4:]
    if p.startswith('+39'):
        p = p[3:]
    return p
assert phone_canon('+39 0461 123456') == '0461123456'
assert phone_canon('+390461123456') == '0461123456'
assert phone_canon('0039 0461 123456') == '0461123456'
assert phone_canon('00390461123456') == '0461123456'
assert phone_canon('003902123456') == '02123456'
assert phone_canon('003902120039') == '02120039'
assert phone_canon('0039021239') == '021239'

phone_prefix#

✪✪ We now want to extract the province prefix from phone numbers (see previous exercise) - the ones we consider as valid are in province_prefixes list.

Note some numbers are from mobile operators and you can distinguish them by prefixes like 328 - the ones we consider are in mobile_prefixes list.

Implement a function that RETURNs the prefix of the phone as a string. Remember first to make it canonical !!

  • If phone is mobile, RETURN string 'mobile'. If it is neither a province phone nor a mobile, RETURN the string 'unrecognized'

  • To determine if the phone is mobile or from province, use province_prefixes and mobile_prefixes lists.

  • DO USE THE PREVIOUSLY DEFINED FUNCTION phone_canon(phone)

province_prefixes = ['0461', '02', '011']
mobile_prefixes = ['330', '340', '328', '390', '3771']
def phone_prefix(phone):
    c = phone_canon(phone)
    for m in mobile_prefixes:
        if c.startswith(m):
            return 'mobile'
    for p in province_prefixes:
        if c.startswith(p):
            return p
    return 'unrecognized'
assert phone_prefix('0461123') == '0461'
assert phone_prefix('+39 0461  4321') == '0461'
assert phone_prefix('0039011 432434') == '011'
assert phone_prefix('328 432434') == 'mobile'
assert phone_prefix('+39340 432434') == 'mobile'
assert phone_prefix('00666011 432434') == 'unrecognized'
assert phone_prefix('12345') == 'unrecognized'
assert phone_prefix('+39 123 12345') == 'unrecognized'

palindrome#

✪✪✪ A word is a palindrome if it is exactly the same when you read it in reverse

Write a function that RETURNs True if the given word is a palindrome, False otherwise

  • assume that the empty string is a palindrome

There are various ways to solve this problem, some actually easy & elegant. Try to find at least a couple of them.

def palindrome(word):
    for i in range(len(word) // 2):
        if word[i] != word[len(word)- i - 1]: 
            return False
                  
    # note how the return is outside the for loop: after passing all checks, we can
    # conclude that the word is actually a palindrome
    return True   
assert palindrome('') == True    # we assume the empty string is palindrome
assert palindrome('a') == True
assert palindrome('aa') == True
assert palindrome('ab') == False
assert palindrome('aba') == True
assert palindrome('bab') == True
assert palindrome('bba') == False
assert palindrome('abb') == False
assert palindrome('abba') == True
assert palindrome('baab') == True
assert palindrome('abbb') == False
assert palindrome('bbba') == False
assert palindrome('radar') == True
assert palindrome('abstruse') == False
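
The exercise text mentions that there are several ways to solve this problem. One possible alternative, sketched here under the hypothetical name palindrome_slice, relies on slicing with a negative step to build a reversed copy of the string and compare it with the original.

```python
def palindrome_slice(word):
    # word[::-1] builds a reversed copy of the string
    return word == word[::-1]

assert palindrome_slice('') == True    # the empty string equals its own reverse
assert palindrome_slice('a') == True
assert palindrome_slice('ab') == False
assert palindrome_slice('abba') == True
assert palindrome_slice('radar') == True
assert palindrome_slice('abstruse') == False
```

This version is shorter, but it always creates a whole new string, while the loop version can stop at the first mismatch.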