Bits Data Python
11/25/2023
Bits: Python Data Types
types, initialization, construction, assignment
Synopsis:
Most of a language's syntax and semantics derive directly from its type system design. This is true of all of the languages discussed in these Bits: C++, Rust, C#, Python, and JavaScript. Python has a relatively simple type system, composed entirely of dynamic types. Dynamic types are checked at run-time. A variable may be bound to an instance of any type at run-time and may be rebound to an instance of another type at any time. This page demonstrates simple uses of the most important Python types. The purpose is to quickly acquire some familiarity with types and their uses.
- All instances of Python types live in Python's managed heap.
- Construction, assignment, and pass-by-value copy references, not the instances stored in the Python managed heap. That means that each of these operations results in two handles pointing to the same instance in the managed heap.
- Python has a deepcopy operation, defined in the copy module, that implements a clone operation, i.e., the original and clone are independent instances and do not share any data.
- Python does not support move operations since its garbage collector owns all type instances. There is no need to use moves to avoid the expense of possibly large copies, since assignment and argument passing copy only references.
- Python is memory safe because all memory management operations are executed by its execution engine, not by user code. That comes with a performance cost: throughput and latency are significantly worse than those of native code languages like C++ and Rust.
- Here we begin to see significant differences between the languages, especially when comparing statically typed languages like C++, Rust, and C#, with dynamically typed languages like Python and JavaScript.
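The handle-copy and deepcopy behavior described in the list above can be illustrated with a minimal sketch. The names `append_item`, `rebind`, `original`, `alias`, and `clone` are hypothetical and are not taken from the Bits code.

```python
import copy

def append_item(items):
    # mutates the object that the caller's handle refers to
    items.append(99)

def rebind(items):
    # rebinding the local handle does not affect the caller's handle
    items = [0]
    return items

original = [[1, 2], [3, 4]]
alias = original                  # second handle, same heap object
clone = copy.deepcopy(original)   # independent instance, no shared data

append_item(original)             # visible through every handle
rebind(original)                  # no effect on original
original[0].append(5)             # nested list is shared with alias only

print(alias is original)   # True
print(alias)               # [[1, 2, 5], [3, 4], 99]
print(clone)               # [[1, 2], [3, 4]]
```

Passing an argument copies the handle, so a function can mutate the caller's object but cannot rebind the caller's variable.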
Python Types Details
Table 1.0 Python Types
Type | Comments | Example |
---|---|---|
-- Scalar reference types -- | | |
bool | values True and False | bl = True |
int | integer values; object size grows with magnitude | i = 42 |
float | 64-bit double precision; values have finite precision and may be approximate | f = -3.33e5 |
complex | real and imaginary floating point parts | c = 1.5 + 3.0j |
-- Aggregate reference types -- | | |
bytes | immutable binary sequence of bytes, 1 byte = 8 bits | b0 = byts[0] |
bytearray | mutable binary sequence of bytes | ba[0] = 1 |
tuple | immutable collection of heterogeneous types accessed by position | third = tup[2] |
range | immutable sequence of integers defined by start, stop, and step | # result: 1, 3, 5, 7, 9 |
str | Immutable collection of Unicode characters. Several str methods return a new string object. | s0 = s[0] # value = "h" |
list | mutable collection of items of any type. It is common practice to use a single type. | l0 = l[0] |
dict | An associative collection of key-value pairs. Keys must be hashable and may be of different types; values can be of any type. | d["three"] = 3 # inserts {"three":3} into d; valz = d["zero"] # value is 0 |
Many additional types defined in Python packages | Collections, Random, TKinter, Requests, Numpy, Matplotlib, Flask, ... | |
-- User-defined Types -- | | |
User-defined types | Based on classes, these will be discussed in the next Bit. | |
Python Type System Attributes
Table 2. Python Type System Attributes
Attribute | Description |
---|---|
Dynamic typing | All Python types are dynamic. Types are evaluated at run-time, and a variable may be bound to any type of data. All data is held in the Python managed heap. |
Type inference | The interpreter determines a variable's type from the object bound to it; no type declarations are required. Optional type hints can be checked by an external static analysis program (type linting). Type linting has no effect on run-time behavior. |
Duck typing | An object is acceptable wherever it supports the operations applied to it. All expressions are checked at run-time. Exceptions are thrown at load-time if there are any failures of syntax. An exception is thrown at run-time if an expression cannot be evaluated. |
Generics | Python code does not need generics at run-time since any type can be passed to a function or added to a collection. The Python interpreter checks the validity of operations on data and throws exceptions if expressions are invalid. |
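The attributes in Table 2 can be seen in a few lines. This is a minimal sketch; `total_length` and `double` are hypothetical functions introduced only for illustration.

```python
# Dynamic typing: a name can be rebound to objects of different types.
x = 42
first_type = type(x).__name__
x = "forty-two"
second_type = type(x).__name__

# Duck typing: total_length accepts any items that support len().
def total_length(items):
    return sum(len(item) for item in items)

ok = total_length(["ab", (1, 2, 3), {"k": 0}])   # str, tuple, dict

# An invalid operation is detected only when the expression is evaluated.
try:
    total_length([1, 2])      # int does not support len()
    failed = False
except TypeError:
    failed = True

# Type hints are not enforced by the interpreter.
def double(n: int) -> int:
    return n * 2

hinted = double("ab")         # runs despite the int hint

print(first_type, second_type, ok, failed, hinted)
# int str 6 True abab
```

An external type checker would flag the call `double("ab")`, but the interpreter runs it without complaint.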
1.0 Initialization
1.1 Scalar Types
#-------------------------------------------
# All code used for output has been elided
#-------------------------------------------
# NoneType
n = None
# boolean
bl = True
bl2 = bool()
# integer
i = 42
i2 = -3000000000000000
i3 = int()
# float
f = 3.1415927
f2 = 3.33333e5
f3 = 3.33333e55
f4 = float()
# complex
c = 1.5 + 3.0j
r = c.real
i = c.imag
c1 = complex()
Output
--------------------------------------
Initialize Scalars
--------------------------------------
-- n = None --
n <class 'NoneType'>
value: None , size: 16
-- bl = true --
bl <class 'bool'>
value: True , size: 28
-- bl2 = bool() --
bl2 <class 'bool'>
value: False , size: 24
-- i = 42 --
i <class 'int'>
value: 42 , size: 28
-- i = -3000000000000000 --
i2 <class 'int'>
value: -3000000000000000 , size: 32
-- i = int() --
i3 <class 'int'>
value: 0 , size: 24
-- f = 3.1415927 --
f <class 'float'>
value: 3.1415927 , size: 24
-- f2 = 333333.0 --
f2 <class 'float'>
value: 333333.0 , size: 24
-- f3 = 3.33333e+55 --
f3 <class 'float'>
value: 3.33333e+55 , size: 24
-- f4 = float() --
f4 <class 'float'>
value: 0.0 , size: 24
-- c = 1.5 + 3.0j --
c <class 'complex'>
value: 1.5+3j , size: 32
c.real <class 'float'>
value: 1.5 , size: 24
c.imag <class 'float'>
value: 3.0 , size: 24
-- c1 = complex() --
c1 <class 'complex'>
value: 0j , size: 32
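Several details in the output above follow from how scalars are represented: `True` reports the same size as a small nonzero `int`, integer sizes grow with magnitude, and floats have finite precision. The following sketch, with hypothetical variable names, checks those properties without assuming specific byte counts, since sizes vary between CPython versions.

```python
import math
import sys

# bool is a subclass of int, so True and False are integer objects
bool_is_int = isinstance(True, int)
two = True + True

# int objects grow with magnitude (exact sizes are CPython-specific)
small = sys.getsizeof(42)
large = sys.getsizeof(10**100)

# float is a binary double: some decimal values are stored approximately
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
nearly_equal = math.isclose(total, 0.3)

print(bool_is_int, two)              # True 2
print(large > small)                 # True
print(exactly_equal, nearly_equal)   # False True
```

Comparing floats with `math.isclose` rather than `==` avoids surprises from rounding.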
1.2 Aggregate Types
# bytes
byts = b"python"
byts1 = byts[0]
# byts[1] = 2 # error: bytes are immutable
bytstr = byts.decode('UTF-8')
# bytearray
ba = bytearray(4)
ba[0] = 1
ba[1] = 0xee
ba[2] = 0
ba[3] = 1
# str
s = "hello python"
# range
r = range(6)
for n in r:
    print(n, " ", end="")
print()
# 0 1 2 3 4 5
r2 = range(-1, 10, 2)
for n in r2:
    print(n, " ", end="")
print(nl)  # nl = "\n" is defined in section 1.5
# -1 1 3 5 7 9
# tuple
t = (42, 3.1415927, 'z')
telem2 = t[1]
# list
l = [1, 2, 3, 2, 1]
l.append("weird")
l0 = l[0]
# dict
d = { "zero": 0, "one": 1, "two":2 }
d["three"] = 3 # insert new element
d2 = d["two"] # access value at key = "two"
Output
----------------------------------------
Initialize Aggregates
----------------------------------------
-- byts = b"python" --
byts <class 'bytes'>
value: b'python' , size: 39
byts[0] <class 'int'>
value: 112 , size: 28
byts.decode(UTF-8) <class 'str'>
value: python , size: 55
byts.decode(UTF-8)[0] <class 'str'>
value: p , size: 50
-- ba = bytearray(4) --
ba <class 'bytearray'>
value: bytearray(b'\x01\xee\x00\x01') , size: 61
-- s = "hello python" --
s <class 'str'>
value: hello python , size: 61
empty str <class 'str'>
value: , size: 49
1 char str <class 'str'>
value: h , size: 50
2 char str <class 'str'>
value: he , size: 51
s[0] <class 'str'>
value: h , size: 50
-- r = range(6) --
range(6) <class 'range'>
value: range(0, 6) , size: 48
0 1 2 3 4 5
range(-1, 10, 2) <class 'range'>
value: range(-1, 10, 2) , size: 48
-1 1 3 5 7 9
-- t = (42, 3.1415927, 'z') --
t <class 'tuple'>
value: (42, 3.1415927, 'z') , size: 64
t[1] <class 'float'>
value: 3.1415927 , size: 24
-- l = [1, 2, 3, 2, 1] --
l <class 'list'>
value: [1, 2, 3, 2, 1] , size: 104
-- l.append("weird") --
l <class 'list'>
value: [1, 2, 3, 2, 1, 'weird'] , size: 104
-- l0 = l[0] --
l0 <class 'int'>
value: 1 , size: 28
-- d = { "zero": 0, "one": 1, "two":2 } --
d <class 'dict'>
value: {'zero': 0, 'one': 1, 'two': 2},
size: 232
-- d["three"] = 3 --
d <class 'dict'>
value:
{'zero': 0, 'one': 1, 'two': 2, 'three': 3},
size: 232
-- d2 = d["two"] --
d2 <class 'int'>
value: 2 , size: 28
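The mutability rules for the aggregates above can be checked directly. This is a minimal sketch; names such as `bytes_error` and `key_error` are hypothetical, and the dictionary contents are chosen only for illustration.

```python
byts = b"python"
try:
    byts[1] = 2                   # bytes are immutable
    bytes_error = None
except TypeError as err:
    bytes_error = type(err).__name__

ba = bytearray(4)                 # four zero bytes
ba[0] = 1                         # elements can be changed in place
ba.append(0xFF)                   # and the bytearray can grow
ba_len = len(ba)

# dict keys may be of different types as long as they are hashable
d = {"zero": 0, 1: "one", (2, 3): [2, 3]}
try:
    d[[4, 5]] = "list key"        # a list is mutable, so it is not hashable
    key_error = None
except TypeError as err:
    key_error = type(err).__name__

r = range(-1, 10, 2)
r_items = list(r)                 # materialize the range as a list

print(bytes_error, ba_len, key_error)   # TypeError 5 TypeError
print(r_items)                          # [-1, 1, 3, 5, 7, 9]
```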
1.3 Copy and Modify Scalar
#-- Copy and Modify scalar --
t1 = 42 # t1 is handle pointing to value 42 in heap
t2 = t1 # copy handle to existing heap-based value
t1 = 0 # reset handle to new heap object
Output
----------------------------------------
Copy and Modify scalar
----------------------------------------
-- t1 = 42 --
t1 <class 'int'>
value: 42 , size: 28
-- t2 = t1 --
t2 <class 'int'>
value: 42 , size: 28
t1: address: 0x1dc78420610
t2: address: 0x1dc78420610
----------------------------------------
After copy construction t1 and t2 have
same address, e.g., two handles pointing
to the same heap integer object.
----------------------------------------
-- t1 = 0 # new object --
t1 <class 'int'>
value: 0 , size: 24
t1: address: 0x1dc784200d0
t2 <class 'int'>
value: 42 , size: 28
t2: address: 0x1dc78420610
----------------------------------------
After setting new value for t1,
t1 and t2 have unique addresses,
e.g., two handles pointing to different
heap integer objects.
----------------------------------------
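The same behavior can be checked with the `is` operator instead of printed addresses. This small sketch reuses the names `t1` and `t2` from above; `same_before` and `same_after` are hypothetical names. It also shows that augmented assignment on an `int` rebinds the handle rather than changing the object.

```python
t1 = 42
t2 = t1                   # two handles, one int object
same_before = t1 is t2

t1 += 1                   # int is immutable: t1 is rebound to a new object
same_after = t1 is t2

print(same_before, same_after)   # True False
print(t1, t2)                    # 43 42
```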
1.4 Copy and Modify Aggregates
#-- copy and modify aggregates --
t3 = [1, 2, 3, 2, 1] # handle pointing to list in heap
t4 = t3 # copy construction (copies the handle)
t3.append(0) # changes value in place, doesn't create new object
t5 = "Hello Python" # handle pointing to string in heap
t6 = t5 # copy construction (copies the handle)
t6 = t6.replace("P", "p") # str is immutable: replace returns a new object
Output
----------------------------------------
copy and modify aggregate
----------------------------------------
-- t3 = [1, 2, 3, 2, 1] --
t3 <class 'list'>
value: [1, 2, 3, 2, 1] , size: 104
-- t4 = t3 --
t4 <class 'list'>
value: [1, 2, 3, 2, 1] , size: 104
t3: address: 0x248ce392280
t4: address: 0x248ce392280
----------------------------------------
After copy construction t3 and t4 have
same address, e.g., two handles pointing
to the same heap list object.
----------------------------------------
-- t3.append(0) # modify object --
t3 <class 'list'>
value: [1, 2, 3, 2, 1, 0] , size: 104
t3: address: 0x248ce392280
t4 <class 'list'>
value: [1, 2, 3, 2, 1, 0] , size: 104
t4: address: 0x248ce392280
----------------------------------------
After appending new value for t3,
t3 and t4 still have same value and
address, e.g., two handles pointing
to same heap list object. No copy
on write.
----------------------------------------
-- t5 = "Hello Python" --
t5 <class 'str'>
value: Hello Python , size: 61
-- t6 = t5 # copy construction --
t6 <class 'str'>
value: Hello Python , size: 61
t5: address: 0x248ce3ac5f0
t6: address: 0x248ce3ac5f0
----------------------------------------
After copy construction t5 and t6 have
same address, e.g., two handles pointing
to the same heap string object.
----------------------------------------
-- t6 = t6.replace("P", "p") # copy on write --
t6 <class 'str'>
value: Hello python , size: 61
t5 <class 'str'>
value: Hello Python , size: 61
t6: address: 0x248ce3ada70
t5: address: 0x248ce3ac5f0
----------------------------------------
After modifying value for t6,
t5 and t6 have different values
and addresses, e.g., string has
copy on write.
----------------------------------------
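When an independent list is wanted, a shallow copy is the lightest option, but it only copies the outer container. The sketch below uses hypothetical names (`alias`, `shallow`) and a nested list chosen for illustration. For strings, the change in address shown above comes from `replace` returning a new object, since `str` is immutable.

```python
t3 = [[1, 2], [3, 4]]
alias = t3                 # same list object
shallow = t3.copy()        # new outer list, same inner lists

t3.append([5, 6])          # outer change: seen via alias, not via shallow
t3[0].append(99)           # inner change: seen via both, inner lists are shared

t5 = "Hello Python"
t6 = t5.replace("P", "p")  # returns a new str, t5 is unchanged

print(alias)     # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)   # [[1, 2, 99], [3, 4]]
print(t5, "|", t6)   # Hello Python | Hello python
```

If the inner lists must also be independent, `copy.deepcopy` from the copy module is the appropriate tool.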
1.5 Functions
Function Code
#-- Analysis and Display Functions --
# Python/Py_Data::Py_DataAnalysis.py
#
import sys
nl = "\n"
# displays type, value, and size of apex object
# - does not account for sizes of descendant objects
def showType(t, nm, suffix = "") :
    print(nm, type(t))
    print(
        "value: ", t, ', size: ', sys.getsizeof(t), suffix
    )
# id is heap address (in CPython)
def showIdent(t, nm, suffix = "") :
    print(nm, ": ", hex(id(t)), suffix, sep='')
# evaluates heap address
def showAddress(t, nm, suffix = "") :
    print(nm, "address: ", hex(id(t)), suffix, sep='')
# show text encased in upper and lower lines
def showNote(text, suffix = "") :
    print(
        "----------------------------------------"
    )
    print(" ", text)
    print(
        "----------------------------------------", suffix
    )
# show text enclosed in -- delimiters on same line
def showOp(text, suffix = ""):
    print("--", text, "--", suffix)
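Because `sys.getsizeof` reports only the size of the apex object, the sizes shown for lists, tuples, and dicts above exclude their elements. A recursive estimate can be sketched as follows. The function `deep_size` is hypothetical, not part of Py_DataAnalysis.py, and it handles only the built-in containers listed in the code.

```python
import sys

def deep_size(obj, seen=None):
    """Approximate size in bytes of obj plus the objects it contains."""
    if seen is None:
        seen = set()
    if id(obj) in seen:           # count each shared object only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_size(key, seen) + deep_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_size(item, seen)
    return size

l = [1, 2, 3, 2, 1, "weird"]
shallow = sys.getsizeof(l)        # apex object only
deep = deep_size(l)               # apex object plus distinct elements

print(deep > shallow)             # True
```

The `seen` set keeps repeated references, such as the two `1` and two `2` entries in the list, from being counted twice.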
2.0 VS Code View
3.0 References
Reference | Description |
---|---|
Python Data Types - w3schools | Interactive examples |
Python Data Types - docs.python.org | Detailed description of Python Types |
Character Encodings | Detailed summary of Python strs and encodings. |
Python Libraries | Summaries of 14 popular libraries and modules. |