@@ -17,6 +17,8 @@ concrete realisations of a mathematical idea.
An *abstract data type* is a purely mathematical :term:`type`,
defined independently of its concrete realisation as code.

+Abstract data types enable the programmer to reason about algorithms and their
+cost separately from the task of implementing them.

That said, it will frequently be helpful in understanding abstract
data types to refer to the ways in which they might be implemented.
@@ -37,18 +39,24 @@ a :term:`LIFO (last in, first out)`, because the last object added to
the stack is the first object retrieved (contrast :term:`FIFO <FIFO (first in, first out)>`).

Recall that a :term:`type` is defined by a set of possible values and
-a set of operations. A stack is an ordered sequence of objects (of any
-type) with the operations `push` to add a new object to the sequence,
+a set of operations. The value of a stack is an ordered sequence of objects of
+any type. The operations are `push` to add a new object to the sequence,
and `pop` to return the most recently added object, and remove it from
-the sequence. It is also common to add an additional operation of
+the sequence. :numref:`stackdiag` shows these operations. It is also common to add an additional operation of
`peek`, which returns the most recently added object without removing
it from the stack.
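The push, pop, and peek operations can be sketched in a few lines of Python, using a list as the backing store; this mirrors the stack implementation shown later in the chapter, though the class name here is illustrative:

```python
class Stack:
    """A stack of objects: a LIFO ordered sequence."""

    def __init__(self):
        self.data = []

    def push(self, value):
        """Add a new object to the top of the stack."""
        self.data.append(value)

    def pop(self):
        """Remove and return the most recently pushed object."""
        return self.data.pop()

    def peek(self):
        """Return the most recently pushed object without removing it."""
        return self.data[-1]


s = Stack()
s.push(24)
s.push(12)
s.push(57)
assert s.peek() == 57  # 57 is on top, and peek leaves it there
assert s.pop() == 57   # last in, first out
assert s.pop() == 12
```

Because a Python list appends and pops at its end cheaply, each of these stack operations is inexpensive, a point taken up in the discussion of algorithmic complexity below.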
.. note::

-    Put a diagram illustrating stack operations here.
+    The stack operations in the diagram are wrong. The stack is back to front
+    and the wrong value is popped.
+
+.. _stackdiag:

.. blockdiag::
+    :caption: Cartoon of a sequence of stack operations. First 24, 12, 57 are
+        pushed, then 57 is popped.
+

    blockdiag stack{
        // setup info
@@ -261,9 +269,9 @@ Algorithmic complexity
The second reason that understanding abstract data types is important
is that a good implementation of a well designed abstract data type
will have well-defined performance characteristics. In particular, the
-optimal algorithmic complexity, expressed in big 'O' notation, of
+optimal algorithmic complexity, expressed in big :math:`O` notation, of
operations on abstract data types will be known. Recall the definition
-of big 'O':
+of big :math:`O`:

.. _bigO:
@@ -327,15 +335,18 @@ data structure.
    def peek(self):
        return self.data[-1]

-:numref:`bigO` is a particular case of the big `O` notation, which you
-may already have seen in numerical analysis. However, there the limit
-is taken as the independent variable approaches 0. This difference of
-context between computer science and numerical analysis is sometimes
-confusing, particularly since both disciplines conventionally leave
-out the limit. It's worth keeping in mind that the difference, because
-a numerical algorithm with :math:`O(h^4)` error is really rather good
-since `h` is small, but an algorithm with :math:`O(n^4)` cost is very
-expensive indeed!
+
+.. note::
+
+    :numref:`Definition %s <bigO>` is a particular case of the big `O` notation,
+    which you may already have seen in numerical analysis. The distinction is
+    that in analysing algorithmic complexity, the limit is taken as :math:`n`
+    approaches infinity, while in numerical analysis the independent variable
+    approaches 0. This difference between two closely related fields is often
+    confusing, particularly since both disciplines conventionally leave out the
+    limit. It's worth keeping the difference in mind, because a numerical
+    algorithm with :math:`O(h^4)` error is really rather good since `h` is
+    small, but an algorithm with :math:`O(n^4)` cost is very expensive indeed!

Amortised complexity and worst case complexity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -359,14 +370,18 @@ full does a further append operation cause Python to allocate more
memory. The amount of memory allocated is approximately proportional
to the current length of the list. That is, if the current list length
is :math:`n` then the new memory allocation will be of size
-approximately :math:`kn` for some :math:`k>1`.
-
-.. note::
-
-    Need diagrams of how a dynamic array works here.
+approximately :math:`kn` for some :math:`k>1`. This concrete data structure is
+called a :term:`dynamic array`. :numref:`dynamicarray` illustrates its operation.
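This growth policy can be sketched directly; the growth factor :math:`k=2` and the names below are illustrative assumptions, not CPython's actual internals:

```python
class DynamicArray:
    """Sketch of a dynamic array with geometric growth (k = 2 assumed)."""

    def __init__(self):
        self.capacity = 1  # size of the allocated buffer
        self.length = 0    # number of slots actually in use
        self.buffer = [None] * self.capacity

    def append(self, value):
        if self.length == self.capacity:
            # Buffer full: allocate a buffer k times larger and copy the
            # whole list across. This is the expensive O(n) step.
            self.capacity *= 2
            new_buffer = [None] * self.capacity
            new_buffer[:self.length] = self.buffer
            self.buffer = new_buffer
        # The common case: there is spare space, so appending is O(1).
        self.buffer[self.length] = value
        self.length += 1


a = DynamicArray()
for i in range(10):
    a.append(i)
# With k = 2, ten appends trigger reallocations at capacities 1, 2, 4, and 8,
# leaving a buffer of capacity 16 holding 10 values.
assert a.capacity == 16
assert a.buffer[:a.length] == list(range(10))
```

Because each reallocation multiplies the capacity by :math:`k`, the total copying cost over :math:`n` appends is proportional to :math:`n`, which is what gives the append operation its amortised constant cost.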
+.. _dynamicarray:
+

.. graphviz::
-    :align: center
+    :caption: A dynamic array implementation of a :class:`list`. The existing
+        memory buffer is full, so when 11 is appended to the list, a larger
+        buffer is created and the whole list is copied into it. When 13 is
+        subsequently appended to the list, there is still space in the buffer so
+        it is not necessary to copy the whole list.
+    :align: center

    digraph dl {
        bgcolor="#ffffff00" # RGBA (with alpha)
@@ -384,7 +399,7 @@ approximately :math:`kn` for some :math:`k>1`.
        style="ellipse, dashed";
        bgcolor="#CD5C5C";
        "node0" [
-            label = "<f0> 2 | 3| 5| 7 |e<f1>"
+            label = "<f0> 2 | 3| 5| 7 |e <f1>"
            shape = "record"
        ];
    }
@@ -394,27 +409,27 @@ approximately :math:`kn` for some :math:`k>1`.
        bgcolor="#2E8B57";

        "node1" [
-            label = "<f0> 2 | 3| 5| 7 | <f1>| | | <f2>"
+            label = "<f0> 2 | 3| 5| 7 | <f1> 11 | | | <f2>"
            shape = "record"
        ];
+    }
+    subgraph cluster_4 {
+        style="ellipse, dashed";
+        bgcolor="#2E8B57";

        "node3" [
            label = "<f0> 2 | 3| 5| 7| <f1> 11| <f2> 13| | <f3>"
            shape = "record"
        ];
    }
-
-    "node0":f0 -> "node1":f0 [
-        id = 0
-    ];

-    "node1":f0 -> "node3":f1 [
+    "node0":f0 -> "node1":f0 [
        id = 2
        label = "append 11"
    ];

-    "node1":f0 -> "node3":f2 [
+    "node1":f0 -> "node3":f0 [
        id = 2
        label = "append 13"
    ];
@@ -451,43 +466,65 @@ contrast, the occasional list append operation is an example of the
list has an amortised time complexity of :math:`O(1)` but a worst-case
time complexity of :math:`O(n)`.

-.. note::
+We can use Python's :term:`introspection` capabilities to illustrate how the
+dynamic allocation of space for a list works as the list is appended to. The
+:func:`sys.getsizeof` function returns the amount of computer memory that an
+object consumes. The function in :numref:`byte_size` uses this to diagnose the
+memory consumption of progressively longer lists, and :numref:`byte_size_demo`
+demonstrates this.

-    Not sure if you want this? Shows the byte usage of the array. If so,
-    I will update the IPython call numbers.
-
-.. code-block:: python
+.. _byte_size:

-    import sys
+.. code-block:: python3
+    :caption: Code to progressively lengthen a :class:`list` and observe the
+        impact on its memory consumption. This function is available as
+        :func:`example_code.linked_list.byte_size`.
+    :linenos:
+
+    import sys
+
+    def byte_size(n):
+        """Print the size in bytes of lists up to length n."""
+        data = []
+        for i in range(n):
+            a = len(data)
+            b = sys.getsizeof(data)
+            print(f"Length:{a}; Size in bytes:{b}")
+            data.append(i)
+
+.. _byte_size_demo:

-    def byteSize(n):
-        data = []
-        for i in range(n):
-            a = len(data)
-            b = sys.getsizeof(data)
-            print(f"Length:{a}; Size of bytes:{b}")
-            data.append(i)
-
.. code-block:: ipython3
-
-    In [1]: n = 10
-    In [2]: byteSize(n)
-    Length:0; Size of bytes:72
-    Length:1; Size of bytes:104
-    Length:2; Size of bytes:104
-    Length:3; Size of bytes:104
-    Length:4; Size of bytes:104
-    Length:5; Size of bytes:136
-    Length:6; Size of bytes:136
-    Length:7; Size of bytes:136
-    Length:8; Size of bytes:136
-    Length:9; Size of bytes:200
-
-Some more abstract data types
------------------------------
-
-Queue and deque
-~~~~~~~~~~~~~~~
+    :caption: The memory consumption of lists of length 0 to 19. We can infer
+        that the list is reallocated at lengths 1, 5, 9, and 17.
+
+    In [1]: from example_code.linked_list import byte_size
+
+    In [2]: byte_size(20)
+    Length:0; Size in bytes:56
+    Length:1; Size in bytes:88
+    Length:2; Size in bytes:88
+    Length:3; Size in bytes:88
+    Length:4; Size in bytes:88
+    Length:5; Size in bytes:120
+    Length:6; Size in bytes:120
+    Length:7; Size in bytes:120
+    Length:8; Size in bytes:120
+    Length:9; Size in bytes:184
+    Length:10; Size in bytes:184
+    Length:11; Size in bytes:184
+    Length:12; Size in bytes:184
+    Length:13; Size in bytes:184
+    Length:14; Size in bytes:184
+    Length:15; Size in bytes:184
+    Length:16; Size in bytes:184
+    Length:17; Size in bytes:256
+    Length:18; Size in bytes:256
+    Length:19; Size in bytes:256
+
+
+Queues and deques
+-----------------
A :term:`queue` is, like a :term:`stack`, an ordered sequence of
objects. The difference is that the only accessible item in the
@@ -502,6 +539,41 @@ deque. Python's standard library contains the
:class:`collections.deque` class, providing a simple and efficient
implementation of a deque.
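For example, a deque from the standard library supports cheap operations at both ends:

```python
from collections import deque

d = deque()
d.append(2)       # add at the right-hand end
d.append(3)
d.appendleft(1)   # add at the left-hand end
assert list(d) == [1, 2, 3]
assert d.popleft() == 1  # remove from the left: queue (FIFO) behaviour
assert d.pop() == 3      # remove from the right: stack (LIFO) behaviour
```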
+Ring buffers
+~~~~~~~~~~~~
+
+How might one go about implementing a deque? A dynamic array allows values to be
+appended with :math:`O(1)` complexity, but doesn't offer an efficient mechanism
+for prepending values. One might think that the natural solution would be to
+create a double-ended dynamic array: a buffer with spare space at each end.
+Unfortunately this is not optimally efficient in the case where the deque is
+used to implement a queue of approximately constant length. In that case, values
+are consistently added at one end of the data structure and removed from the
+other. Even in a double-ended dynamic array, the buffer space at the append end
+of the queue will constantly run out, necessitating an expensive copy operation.
+The solution is to use a dynamic array, but to logically join up its ends, so
+that the first position in the buffer follows on from the last. Only when all
+positions in the buffer are full is the buffer reallocated.
+
+.. figure:: images/ring_buffer.*
+
+    An implementation of a deque in a ring buffer, with a sequence of queue
+    operations illustrating how it works.
+
+    Objects are added to the end of the buffer and removed from its start.
+
+    At step 7, the contents of the buffer wrap around: the queue at this stage
+    contains `D, E, F`.
+
+    At step 9 there is insufficient space in the buffer to append `G`, so new
+    space is allocated and the buffer's contents copied to the start of the new
+    buffer.
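The wrap-around behaviour just described can be sketched with a fixed-capacity buffer; reallocation on overflow is left out for brevity, and all names here are illustrative rather than taken from any particular library:

```python
class RingBuffer:
    """A fixed-capacity queue whose buffer's ends are logically joined."""

    def __init__(self, capacity):
        self.buffer = [None] * capacity
        self.start = 0   # index of the first (oldest) item
        self.length = 0  # number of items currently stored

    def append(self, value):
        if self.length == len(self.buffer):
            raise IndexError("buffer full: a dynamic version would reallocate")
        # The slot after the last item, wrapping past the end of the buffer.
        end = (self.start + self.length) % len(self.buffer)
        self.buffer[end] = value
        self.length += 1

    def popleft(self):
        value = self.buffer[self.start]
        # Advance the start, wrapping back to position 0 if needed.
        self.start = (self.start + 1) % len(self.buffer)
        self.length -= 1
        return value


q = RingBuffer(4)
for x in ["A", "B", "C", "D"]:
    q.append(x)              # the buffer is now full: [A, B, C, D]
assert q.popleft() == "A"    # "A" leaves from the front
q.append("E")                # "E" wraps around into position 0
assert q.buffer == ["E", "B", "C", "D"]
assert [q.popleft() for _ in range(q.length)] == ["B", "C", "D", "E"]
```

Notice that a queue of roughly constant length circulates around the buffer indefinitely without any copying, which is exactly the case that defeats a double-ended dynamic array.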
+
+
+Some more abstract data types
+-----------------------------

Linked lists
~~~~~~~~~~~~
@@ -747,10 +819,10 @@ to keep track of the iteration.
    def __init__(self, link):
        self.here = link

-    def __iter__():
+    def __iter__(self):
        return self

-    def __next__(self):
+    def __next__(self):
        if self.here:
            next = self.here
            self.here = self.here.next