Data Structures

This chapter describes some things you’ve learned about already in more detail, and adds some new things as well.

5.1. More on Lists

  • list.extend(iterable), extend the list by appending all the items from the iterable.

  • list.insert(i,x), Insert an item before list[i], if the given position out of the range, it will be inserted at the front or the rear

    1
    2
    3
    4
    >>> ls = []
    >>> ls.insert(100 , 0)
    >>> ls
    [0]
  • list.remove(x), remove the first item from the list whose value is equal to x. It raises a ValueError if there is no such item.

    1
    2
    3
    >>> ls.remove(0)
    >>> ls
    []
  • list.pop([i]),Remove the item at the given position in the list, and return it. If no index is specified, i = len(list)-1.

    1
    2
    3
    4
    5
    6
    >>> ls
    [0, 1, 2, 3, 4]
    >>> ls.pop(2)
    2
    >>> ls
    [0, 1, 3, 4]
  • list.clear(), Remove all items from the list. Equivalent to del a[:].

  • list.index(x[, start , end]),Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item. The optional start and end is used to make a subsequence via slicing.

    1
    2
    3
    4
    5
    6
    >>> ls
    [0, 1, 3, 0, 1, 3]
    >>> ls.index(1)
    1
    >>> ls.index(1,2,6)
    4
  • list.count(x), Return the number of times x appears in the list.

  • list.sort(* , key = None , reverse = False),Sort the items of the list in place,

    key is a function used to costomize the comparison,(anonymous function is handy here), reverse is a boolen value, decide the bigger one should be at the right or left.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    >>> ls = ['zazzzzz' , 'ybyyyyy' , 'ocoooo' ] 
    >>> ls.sort()
    >>> ls
    ['ocoooo', 'ybyyyyy', 'zazzzzz']
    >>> ls.sort(key = lambda ll : ll[1])
    >>> ls
    ['zazzzzz', 'ybyyyyy', 'ocoooo']
    >>> ls.sort(key = lambda ll : ll[1] , reverse = True)
    >>> ls
    ['ocoooo', 'ybyyyyy', 'zazzzzz']
  • list.reverse(), Reverse the elements of the list in place.

  • list.copy(), Return a shallow copy of the list. Equivalent to a[:].

5.1.1. Using Lists as Stacks

use list.append(e) like stack.push(e), use list.pop() like stack.pop()

5.1.2. Using Lists as Queues

  • use list.insert(0,e) like queue.out(), use list.append(e) like queue.in(e)

  • lists are NOT efficient for this purpose, doing inserts or pops from the beginning of a list have to shift all other elements by one.

  • To implement a queue, use collections.deque

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    >>> from collections import deque
    >>> queue = deque(["Eric", "John", "Michael"])
    >>> queue.append("Terry") # Terry arrives
    >>> queue.append("Graham") # Graham arrives
    >>> queue.popleft() # The first to arrive now leaves
    'Eric'
    >>> queue.popleft() # The second to arrive now leaves
    'John'
    >>> queue # Remaining queue in order of arrival
    deque(['Michael', 'Terry', 'Graham'])

    5.1.3. List Comprehensions

List comprehensions provide a concise way to create lists.

(Try to be pythonic😎)

  • A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses.

  • The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.

    1
    2
    3
    >>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
    [(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
    # If the expression is a tuple, it must be parenthesized
    1
    2
    3
    4
    5
    6
    # flatten a muti-dimension list
    >>> vec
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    >>> flatten = [x for l in vec for x in l]
    >>> flatten
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

    5.1.4. Nested List Comprehensions

The initial expression in a list comprehension can be any arbitrary expression, including another list comprehension.

1
2
3
4
5
6
7
8
>>> matrix = [
... [1, 2, 3, 4],
... [5, 6, 7, 8],
... [9, 10, 11, 12],
... ]
>>> [[row[i] for row in matrix] for i in range(4)]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
# zip() is good at this kind of job than nested listcomp

5.2. The del statement

  • the del statement can be used to deletet list items and slice of list.
  • del can also be used to delete entire variables
  • ….

5.3. Tuples and Sequences

  • tuples are immutable, but they can contain mutable elements, for example lists

  • they may be input with or without surrounding parentheses

  • tuple vs list

    Data Structure Tuple List
    mutability immutable mutable
    element type heterogeneous homogeneous
    way to access elements unpacking & indexing Iterating
  • crate tuple

    1
    2
    3
    4
    5
    6
    7
    8
    >>> tp = () # tuple without elements
    >>> type(tp)
    <class 'tuple'>
    >>> tp = 'e' , # tuple with one element , ugly but effective
    >>> type(tp)
    <class 'tuple'>
    >>> tp
    ('e',)
  • tuple packing and unpacking

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    >>> t = 1 ,2 , 3 ,4 # packing into t
    >>> t
    (1, 2, 3, 4)
    >>> a,b,c,d = t # unpacking t , the number of variables must be len(t)
    >>> a
    1
    >>> b
    2
    >>> c
    3
    >>> d
    4
  • multiple assignment is really just a combination of tuple packing and sequence unpacking.

5.4. Sets

A set is an unordered collection with no duplicate elements.

  • membership testing and eliminating duplicate entries

  • support mathematical operations like union, intersection, difference, and symmetric difference

  • Demonstration

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    >>> basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
    >>> print(basket) # show that duplicates have been removed
    {'orange', 'banana', 'pear', 'apple'}
    >>> 'orange' in basket # fast membership testing
    True
    >>> 'crabgrass' in basket
    False

    >>> # Demonstrate set operations on unique letters from two words
    ...
    >>> a = set('abracadabra')
    >>> b = set('alacazam')
    >>> a # unique letters in a
    {'a', 'r', 'b', 'c', 'd'}
    >>> a - b # letters in a but not in b
    {'r', 'd', 'b'}
    >>> a | b # letters in a or b or both
    {'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
    >>> a & b # letters in both a and b
    {'a', 'c'}
    >>> a ^ b # letters in a or b but not both
    {'r', 'd', 'b', 'm', 'z', 'l'}
  • set comprehensions are also supported

    1
    2
    3
    >>> a = {x for x in 'abracadabra' if x not in 'abc'}
    >>> a
    {'r', 'd'}

    5.5. Dictionaries

  • dictionaries are indexed by keys, which can be any immutable type(tuples without mutable elements can be keys)

  • It is also possible to delete a key:value pair with del

  • store using a key that is already in use, the old value associated with that key is forgotten.

  • To check whether a single key is in the dictionary, use the in keyword.

  • use list(d) to get a list of keys, use sorted(d) to get a list of values

  • dictionary comprehensions are also supported

    1
    2
    >>> {x: x**2 for x in (2, 4, 6)}
    {2: 4, 4: 16, 6: 36}

    5.6. Looping Techniques

  • To loop over two or more sequences at the same time, the entries can be paired with the zip() function.

    1
    2
    3
    4
    5
    6
    7
    8
    >>> questions = ['name', 'quest', 'favorite color']
    >>> answers = ['lancelot', 'the holy grail', 'blue']
    >>> for q, a in zip(questions, answers):
    ... print('What is your {0}? It is {1}.'.format(q, a))
    ...
    What is your name? It is lancelot.
    What is your quest? It is the holy grail.
    What is your favorite color? It is blue.
  • To loop over a sequence in reverse, first specify the sequence in a forward direction and then call the reversed() function.

  • To loop over a sequence in sorted order, use the sorted() function which returns a new sorted list while leaving the source unaltered.

  • Using set() on a sequence eliminates duplicate elements.

5.7. More on Conditions

  • The operators is and is not compare whether two objects are really the same object; this only matters for mutable objects like lists.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    >>> ls = [ 1 ,2 ,3 ]
    >>> ll = ls.copy()
    >>> ll
    [1, 2, 3]
    >>> ls
    [1, 2, 3]
    >>> ls is ll
    False
    >>> id(ls)
    4421096768
    >>> id(ll)
    4421064448
    >>> L = ls
    >>> id(L)
    4421096768
    >>> L is ls
    True
  • Comparisons can be chained. For example, a < b == c tests whether a is less than b and moreover b equals c.(😎)

    1
    2
    >>> 1 <= 2 <= 3 # this should be 1 <= 2 && 2 <= 3 in C
    True
  • and and or are so-called short-circuit operators, like C. And the return value of a short-circuit operator is the last evaluated argument.

    1
    2
    3
    >>> res = 1 > 2 and 1 == 1 and 2 ==2 
    >>> res
    False
  • Note that in Python, unlike C, assignment inside expressions must be done explicitly with the walrus operator :=. This avoids a common class of problems encountered in C programs: typing = in an expression when == was intended.

    Python version should >= 3.8

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    >>> while a := (b :=b - 1 ):
    ... print(a ,b )
    ...
    9 9
    8 8
    7 7
    6 6
    5 5
    4 4
    3 3
    2 2
    1 1

    5.8. Comparing Sequences and Other Types

  • comparable sequences should have the same sequence type

  • lexicographical order is used

    1
    2
    3
    4
    5
    6
    7
    (1, 2, 3)              < (1, 2, 4)
    [1, 2, 3] < [1, 2, 4]
    'ABC' < 'C' < 'Pascal' < 'Python'
    (1, 2, 3, 4) < (1, 2, 4)
    (1, 2) < (1, 2, -1)
    (1, 2, 3) == (1.0, 2.0, 3.0)
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
  • if the type is different, a TypeError will be raised

    1
    2
    3
    4
    5
    6
    7
    8
    >>> s = set()
    >>> ls = []
    >>> s == ls
    False
    >>> s < ls
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: '<' not supported between instances of 'set' and 'list'