Background

今天在網路上面剛好看到有人分享了這篇文章 Let’s talk about data structures in Python 看完了之後覺得蠻不錯的,所以決定寫個Blog簡單的翻譯這篇文章中提到的東西。

Primitive and non-primitive data structures

資料結構可以簡單地被分為兩種形式: primitive 和 non-primitive。

  • Primitive:

    • Integers
    • Floats
    • Strings
    • Booleans
  • Non-primitive:

    • Lists
    • Arrays
    • Tuples
    • Dicts
    • Sets

Some of the common Big-O’s

Screen-Shot-2019-07-09-at-12.29.05-AM

這程式的世界裡面我們常會用 Big-O 來表達時間&空間使用的優劣程度。上表給大家做參考。

Primitive data structures

  • Booleans are integers in python
>>> isinstance(True, int)
True
>>> isinstance(False, int)
True
>>> True == 1 == 1.0 and False == 0 == 0.0
True
>>> True + 5
6

作者特別提到在 Python 中 Booleans 是 integers 的型態,並舉了幾段程式來做解釋。

List

  • List is a sequence of elements. It can store anything: numbers, strings, other lists, functions and etc.

Python中的List很方便,基本上可以存任何東西,當然這會花費比較多的記憶體。

  • pop() and pop(0). What do you think will be faster? To remove first or last element?
  • In python list is a dynamic array.
  • When you remove an element from the front of the dynamic array, all other elements are shifted one position closer to the beginning.

Python 的 List 是一個 dynamic array , 所以當你移走前面的一個 elemeent , 他必須把後面的 elements 通通往前移。(這會導致耗時,所以耗時的結果: pop() < pop(0))

Big-O’s for different operations with list:

Screen-Shot-2019-07-09-at-12.32.02-AM

Deque

  • Deque is a double-ended queue that supports adding and removing elements from either end in O(1) time.

Deque 是一個非常適合拿來 adding and removing 的資料結構。 (因為他花的時間都屬於 O(1) 層級)

>>> from collections import deque
>>> de = deque([6, 6, 6])
>>> de.appendleft(42)
>>> de
deque([42, 6, 6, 6])
>>> de.popleft()
42
>>> de
deque([6, 6, 6])

Big-O’s for deque and list:

Screen-Shot-2019-07-09-at-12.36.19-AM

Tuple

  • Tuple is basically the same thing as list but with one difference. You can’t add or remove elements from tuples after initialization. It’s immutable data structure.

Truple 是一個初始化之後就不能變動的元素。(他跟 List 很像但只是不能新增刪減元素)

>>> t = ('Noname', 42, True)
# slice tuple
>>> t[:2]
('Noname', 42)
# merge tuples (creates new tuple)
>>> t + (66.6,)
('Noname', 42, True, 66.6)
>>> t
('Noname', 42, True)
  • Tuples are better to use when you need to store a fixed number of elements. (ex: 座標)

Dict

  • The difference between dictionary and list is that you access elements in dictionary by key, not by index.

Dict 是一個 key-value 存放的形式。

Big-O’s for different operations with list:

Screen-Shot-2019-07-09-at-12.52.39-AM

  • The key of a dictionary can be any hashable object.
  • A hashable object is any object that doesn’t change during its lifetime. For example string or number. Not hashable object is obviously any object that changes. For example list. You can add or remove elements from the list.

在Python的世界裡一切的東西都是 Object,每個 Objec t各包含一個 idendity,type和value。

  • identity: 可理解為object的記憶體位址
  • type: Object的型態
  • value: Object的值

我們把 value 可變動的對象稱為 mutable object,把值不可變動的對象稱為immutable(hashable) object。

一個簡單的驗證方法: If a == b is True then hash(a) == hash(b) should also be True.

Set

  • Set stores only unique elements and elements in set are unordered.
  • Python has two types of sets: regular set and frozenset. The only difference between regular set and frozenset is that you can’t add new elements to frozenset after its initialization.

Set 只能存放 unique elements (所有重複的elements都會被消除) , 然後是沒有排序的狀態。

在Python中有兩種形式的 Set,regular set 和 frozenset,那差別只在於 frozenset 在初始化之後便不能新增新的值。

>>> letters = {'a', 'b', 'c'}
>>> 'c' in letters
True
>>> letters.add('d')
>>> letters
{'c', 'b', 'd', 'a'}
>>> letters = frozenset({'a', 'b', 'c'})
>>> letters.add('d')
AttributeError: 'frozenset' object has no attribute 'add'

Reference