
__Python Variables, Data Structures, and Control Logic__

source: https://github.com/zhiyzuo/python-tutorial/blob/master/1-Variables-Data_Structures-Control_Logic.ipynb

@author: Zhiya Zuo

slightly modified and extended by Jens Dittrich for DSAI...

### Code vs Comment

In [1]:
# this is a comment
print('hello world') # this is also a comment starting from the "#" symbol: it is ignored by the Python interpreter

hello world


We write one command per line:

In [2]:
print('1')
print('2')

1
2


and not:

In [3]:
print('1') print('2')

SyntaxError: invalid syntax (<ipython-input-3-6c4a92eb80b4>, line 1)

unless you separate them by a semicolon:

In [5]:
print('1'); print('2')

1
2


*recommendation:* use only one statement per line (increases readability)

### Variables

Variables can be considered __containers__. You can put anything inside a container, __without specifying the size or type__, which would be needed in Java or C. Note that Python is case-sensitive: `z` and `Z` are two different variables.

When assigning a value, we put the variable on the left-hand side (LHS) and the value on the right-hand side (RHS). The two sides are connected by an equals sign (`=`), which means assignment.

In [6]:
x = 3 # integer
y = 3. # floating point number
z = "Hello" # strings
# another string, stored in a variable capital z.
Z = "Wonderful!"
print(x, type(x))
print(y, type(y))
print(z, type(z))
print(Z, type(Z))

3 <class 'int'>
3.0 <class 'float'>
Hello <class 'str'>
Wonderful! <class 'str'>


You can do operations on numeric values as well as strings.

In [7]:
sum_ = x + y # int + float = float
print(sum_)

6.0


In [8]:
v = "World!"
sum_string = z + " " + v # concatenate strings
print(sum_string)

Hello World!


Print with formatting using `%`

In [9]:
# %f for floating point number, <.x> specifies x decimal places
print("The sum of x and y is %.2f"%sum_) 

The sum of x and y is 6.00


In [10]:
# %s for string
print("The string `sum_string` is '%s'"%sum_string)

The string `sum_string` is 'Hello World!'
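
The `%` operator is the oldest of Python's string formatting styles. As a minimal sketch (assuming Python 3.6 or newer for f-strings), the two outputs above can also be produced with an f-string and with `str.format()`. The names `line_f` and `line_format` are made up for this illustration.

```python
# Values mirroring the cells above, redefined so the block runs on its own
sum_ = 6.0
sum_string = "Hello World!"

# f-string: the expression goes inside {}, the format spec follows the colon
line_f = f"The sum of x and y is {sum_:.2f}"

# str.format(): {} is a placeholder, the value is passed as an argument
line_format = "The string `sum_string` is '{}'".format(sum_string)

print(line_f)
print(line_format)
```

All three styles produce the same text; which one to use is mostly a matter of readability.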


#### Naming conventions

There are two commonly used naming conventions in programming:

1. __camelCase__
2. __snake_case__ or __lower_case_with_underscore__

All variable (function and class) names must start with a letter or an underscore (\_). Digits are allowed, but not in the first position.

In [11]:
myStringHere = 'my string'
myStringHere

'my string'

In [12]:
x = 3 # valid
x_3 = "xyz" # valid

In [13]:
3_x = "456" # invalid. Numbers cannot be in the first position.

SyntaxError: invalid token (<ipython-input-13-520aa7218b05>, line 1)

You can choose either camel case or snake case. Always make sure you use one convention consistently across one notebook/project.

See more here:

[1] https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles

[2] https://en.wikipedia.org/wiki/Naming_convention_(programming)

#### Some notes on Strings

To initialize a string variable, you can use either double or single quotes.

In [15]:
dsai = "Data Science and Artificial Intelligence"
dsai

'Data Science and Artificial Intelligence'

You can think of strings as a sequence of characters (or a __list__ of characters, see the next section). In this case, indices and bracket notations can be used to access specific ranges of characters.

In [16]:
mySubstring = dsai[17:20] # [start, end), end is exclusive; Python starts with 0 and NOT 1
mySubstring

'Art'

In [17]:
lastLetter = dsai[-1] # -1 means the last element
lastLetter

'e'
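
A few more slicing variants, as a small sketch: the start or the end of a slice can be omitted, and a third value sets the step size. The variable names below are chosen only for this illustration.

```python
dsai = "Data Science and Artificial Intelligence"

first_word = dsai[:4]        # omitted start defaults to 0
last_word = dsai[-12:]       # omitted end runs to the end of the string
every_second = dsai[5:12:2]  # third value is the step: indices 5, 7, 9, 11

print(first_word)    # Data
print(last_word)     # Intelligence
print(every_second)  # Sine
```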

---

### Simple Data Structures

In this section, we go over some common [primitive](https://www.datacamp.com/community/tutorials/data-structures-python#adt) data types in Python. While the word _primitive_ looks obscure, we can think of it as the most basic data type that cannot be further decomposed into simpler ones (kind of...).

I categorize them into several subsections based on the values they represent.

#### Numbers

Numbers without a fractional part are ___integers___. In Python, their type is called `int`.

In [18]:
x = 3
type(x)

int

Numbers with a fractional part are floating point numbers. Their type is called `float` in Python.

In [19]:
y = 3.0
type(y)

float

We can apply arithmetic to these numbers. However, one thing we need to be careful about is ___type conversion___. See the example below.

In [20]:
z = 2 * x
type(z)

int

In [21]:
z = y + x
type(z)

float
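
Division is a case where the result type may be surprising. The sketch below assumes Python 3, where `/` always returns a `float` and `//` performs floor division. The variable names are illustrative only.

```python
x = 3
y = 3.0

true_div = 7 / 2    # "/" always returns a float
floor_div = 7 // 2  # "//" rounds down; int // int stays an int
even_div = 6 / 3    # still a float, although the result is a whole number
mixed = x * y       # int * float gives a float

print(true_div, type(true_div))
print(floor_div, type(floor_div))
print(even_div, type(even_div))
print(mixed, type(mixed))
```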

#### Text/Characters/Strings

In Python, we use `str` type for storing letters, words, and any other characters, as mentioned previously.

In [22]:
my_word = "see you"
type(my_word)

str

Unlike numbers, `str` is a sequence type: we can access individual characters or ranges of characters by index, and we can iterate through the characters one by one:

In [23]:
print(my_word[0])
print(my_word[2:6])

s
e yo


We can also use `+` to _concatenate_ different strings 

In [24]:
my_word + ' tomorrow'

'see you tomorrow'

#### Boolean

Boolean type comes in handy when we need to check conditions. For example:

In [25]:
my_error = 1.6
compare_result = my_error < 0.1
compare_result, type(compare_result), 42, 43, "izg"

(False, bool, 42, 43, 'izg')

There are two and only two valid Boolean values: `True` and `False`. We can also think of them as `1` and `0`, respectively.

In [26]:
my_error > 0

True

When Boolean values are used in arithmetic operations, `True` is converted to `1` and `False` to `0` automatically.

In [27]:
(my_error>0) + 2

3
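
One practical use of this conversion is counting how many values satisfy a condition. This is a small sketch; the list `errors` and its values are made up for the example.

```python
errors = [1.6, 0.05, 0.3, 0.08]  # hypothetical example values

# each comparison yields True or False; sum() adds them up as 1 and 0
num_small = sum(error < 0.1 for error in errors)

print(num_small)              # 2
print(isinstance(True, int))  # True: bool is a subclass of int
```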

#### Type Conversion

Since variables in Python are dynamically typed, we need to be careful about type conversion.

When two variables share the same data type, there is not much to be worried about:

In [28]:
s1 = "no problem. "
s2 = "talk to you later"
s1 + s2

'no problem. talk to you later'

But be careful when we are mixing variables up:

In [29]:
a = 3 # recall that this is an ____?
b = 2.7 # how about this?
c = a + b # what is the type of `c`?

To make things work between string and numbers, we can explicitly convert numbers into `str`:

In [30]:
s1 + 3

TypeError: can only concatenate str (not "int") to str

In [41]:
s1 + str(3)

'no problem. 3'
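
Conversion also works in the other direction. The following sketch uses the built-in functions `int()`, `float()`, and `str()`; the variable names are chosen only for illustration.

```python
count = int("3")      # str -> int
ratio = float("2.7")  # str -> float
truncated = int(2.7)  # float -> int drops the fractional part (no rounding)
label = "value: " + str(ratio)  # number -> str, so "+" concatenates

print(count, ratio, truncated, label)

# Text that does not look like a number cannot be converted
try:
    int("abc")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)
```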

---

### Data Structures

In this section, we discuss some ___non-primitive___ data structures in Python.

We can think of ___non-primitive___ types as those that can store ___primitive___ data.

#### List

A list is an ordered collection of items that may contain duplicates.

Initialize a list with square brackets. You can store anything in a list, even if the individual elements have different types.
- note that we use [___string formatting___](https://pyformat.info/) to display strings
- `%i` is a placeholder for `int`
- `%s` for `str`

In [42]:
# define a list:
# variable = [el0, el1, ..., eln ]
a_list = [42, 9, 53, 7, 9] # commas separate the elements of a list
print("Length of a_list is: %i"%(len(a_list)))
print("The 3rd element of a_list is: %s" %(a_list[2])) # Remember Python starts with 0
print("The last element of a_list is: %s" %(a_list[-1])) # -1 means the end
print("The sum of a_list is: %i"%(sum(a_list)))

Length of a_list is: 5
The 3rd element of a_list is: 53
The last element of a_list is: 9
The sum of a_list is: 120


We can put different elements of different types into a list:

In [43]:
b_list = [20, True, "good", "good"] 
b_list

[20, True, 'good', 'good']

Modify and Update a list using __pop__, __remove__, __append__, __extend__

In [44]:
a_list = [42, 9, 53, 7, 8, 2, 3, 1]
print(a_list)
print("Pop %i out of a_list"%a_list.pop(1)) # pop (i.e. remove and return) the value at an index position
print(a_list)
print("Pop %i out of a_list"%a_list.pop(2)) # pop (i.e. remove and return) the value at an index position
print(a_list)

[42, 9, 53, 7, 8, 2, 3, 1]
Pop 9 out of a_list
[42, 53, 7, 8, 2, 3, 1]
Pop 7 out of a_list
[42, 53, 8, 2, 3, 1]


In [45]:
b_list = [20, True, "good", "good"] 
print("Remove the string good from b_list:")
b_list.remove("good") # remove first occurrence(!) of a specific value
print(b_list)

Remove the string good from b_list:
[20, True, 'good']


In [46]:
print("Remove the string good from b_list:")
b_list.remove("good") # remove first occurrence(!) of a specific value
print(b_list)

Remove the string good from b_list:
[20, True]


In [47]:
a_list.append(10) # append integer 10 to the end of the list
print("After appending a new value, a_list is now: %s"%(str(a_list)))

After appending a new value, a_list is now: [42, 53, 8, 2, 3, 1, 10]


Merge `a_list` and `b_list`, i.e. append all elements of `b_list` to the end of `a_list`:

In [48]:
a_list.extend(b_list)
print("Merging a_list and b_list: %s"%(str(a_list)))

Merging a_list and b_list: [42, 53, 8, 2, 3, 1, 10, 20, True]


We can also use `+` as a shorthand to concatenate two lists

In [49]:
a_list + b_list 

[42, 53, 8, 2, 3, 1, 10, 20, True, 20, True]
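
Note the difference between the two approaches: `+` builds a new list and leaves both operands untouched, while `extend` changes the list in place. A short sketch with shortened example lists (the names `combined` and `before` are illustrative):

```python
a_list = [42, 53, 8]
b_list = [20, True]

combined = a_list + b_list  # builds a new list
before = list(a_list)       # copy of a_list, to show it has not changed
print(combined)
print(before)

a_list.extend(b_list)       # modifies a_list in place
print(a_list)
```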

#### Tuple (a special case of a list whose elements cannot be changed)

Initialize a tuple with parentheses. The major difference between a list and a tuple is that you can alter a list but not a tuple.

In [50]:
a_tuple = (1, 2, 3, 10)
print(a_tuple)
print("First element of a_tuple: %i"%a_tuple[0])

(1, 2, 3, 10)
First element of a_tuple: 1


You cannot change the values of a_tuple

In [51]:
a_tuple[0] = 5

TypeError: 'tuple' object does not support item assignment

In order to create a single value tuple, you need to add a ','

In [53]:
a_tuple = (1) # this creates an int: parentheses alone only group the expression
print(type(a_tuple))
b_tuple = (1,) # this creates a tuple; take note of the comma
print(type(b_tuple))

<class 'int'>
<class 'tuple'>


#### Set

A set is an unordered, duplicate-free collection of items.

In [54]:
a_list = [42, 9, 53, 7, 9] 
a_set = {42, 9, 53, 7, 9}

a_list, a_set

([42, 9, 53, 7, 9], {7, 9, 42, 53})

In [55]:
type(a_list), type(a_set)

(list, set)

In [56]:
# you can convert a list to a set:
conv = set(a_list)
conv, type(conv)

({7, 9, 42, 53}, set)

In [57]:
# and vice versa:
# you can convert a set to a list:
conv = list(a_set)
conv, type(conv)

([9, 42, 53, 7], list)
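
Sets are handy for removing duplicates, for membership tests, and for the usual set operations. The sketch below uses made-up example sets `left` and `right`. Because a set has no defined order, the deduplicated values are sorted to get a predictable result.

```python
a_list = [42, 9, 53, 7, 9]

unique_sorted = sorted(set(a_list))  # deduplicate, then sort into a list
has_53 = 53 in set(a_list)           # membership test

left = {7, 9, 42}   # hypothetical example sets
right = {9, 42, 53}

common = left & right     # intersection: items in both sets
either = left | right     # union: items in at least one set
only_left = left - right  # difference: items in left but not in right

print(unique_sorted, has_53)
print(common, either, only_left)
```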

#### Dictionary: key-value pairs

A dictionary (aka map) is a collection of unique keys that are mapped to values; the values may contain duplicates. Since Python 3.7, dictionaries preserve the order in which keys were inserted.

Initialize a dictionary using curly brackets `{}`

In [58]:
d = {} # empty dictionary
d[1] = "foo" # add a key-value pair using bracket notation; values can be anything, keys must be hashable (e.g. int, str, tuple)
d[7] = "bar"
d[3] = "blubb"
print(d)

{1: 'foo', 7: 'bar', 3: 'blubb'}


In [59]:
d['KI'] = 'AI'

In [60]:
d['AI'] 

KeyError: 'AI'

In [None]:
#notice that the type of {} is dict and not set (this is for historic reasons)
type({})

In [None]:
list(d.keys())

In [None]:
list(d.values())
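
Looking up a missing key with brackets raises a `KeyError`, as seen above. A minimal sketch of two common ways to avoid this, using the `in` operator and the `get()` method (the fallback text `"not found"` is an arbitrary choice for this example):

```python
d = {1: "foo", 7: "bar", 3: "blubb", "KI": "AI"}

has_ai = "AI" in d  # False: "AI" is a value here, not a key
fallback = d.get("AI", "not found")  # default instead of a KeyError
existing = d.get("KI", "not found")  # key exists, so its value is returned

print(has_ai, fallback, existing)

# items() yields (key, value) pairs
for key, value in d.items():
    print(key, "->", value)
```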

---

### Control Logic

In the following examples, we show comparisons, `if-else` statements, `for` loops, and `while` loops.

#### Comparison

Python's comparison operators closely follow mathematical notation:

1. Larger (or equal): `>` (`>=`)
2. Smaller (or equal): `<` (`<=`)
3. Equal to: `==` (__Notice that there are two equal signs__)
4. Not equal to: `!=`

In [61]:
# the following is a condition which must return a boolean value
3 == 5 

False

In [62]:
a == 767

False

In [63]:
a

3

In [64]:
72 >= 2

True

IMPORTANT: It is worth noting that comparisons between floating point numbers are tricky.

In [65]:
print(2.2 * 3.0)
2.2 * 3.0 == 6.6

6.6000000000000005


False

In [66]:
2.2 * 3.0

6.6000000000000005

In [67]:
3.3 * 2 == 6.6

True

See https://docs.python.org/2/tutorial/floatingpoint.html for the explanation; we will get back to this in the Programming 2 lecture.

Therefore, be really careful when you have to do such comparison.
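
One common way to compare floats is to test whether they are close enough rather than exactly equal. The sketch below uses `math.isclose` from the standard library (available since Python 3.5) and, as an alternative, a manual check against a tolerance. The tolerance `1e-9` is an arbitrary choice for this example.

```python
import math

product = 2.2 * 3.0

exact = product == 6.6               # False because of rounding errors
close = math.isclose(product, 6.6)   # True: equal within a relative tolerance
manual = abs(product - 6.6) < 1e-9   # True: same idea with an absolute tolerance

print(exact, close, manual)
```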

#### If-Else

In [68]:
sum_ = 30 # the trailing underscore avoids shadowing the built-in function sum()

In [69]:
if sum_ > 5:
    print('sum_ is above 5') # this statement MUST be indented
    print('dsfsd')

sum_ is above 5
dsfsd


In [70]:
if sum_ > 5:
    print('sum_ is above 5') # this statement MUST be indented
    if sum_ > 15:
        print('sum_ is above 15') # a nested block is indented one level further

sum_ is above 5
sum_ is above 15


In Python, indentation marks blocks (by convention four spaces per level, see PEP 8). In Java and C++, blocks are marked using curly brackets {}.

```Java
if (sum>5) {
    System.out.println("sum_ is above 5");
}
```

We do not have this in Python!

In [71]:
sum_ = 1
if sum_ == 0:
    print("sum_ is 0")
elif sum_ < 0:
    print("sum_ is less than 0")
else:
    print("sum_ is above 0 and its value is " + str(sum_)) # convert sum_ to a string

sum_ is above 0 and its value is 1


Comparing strings to check whether they are equal (the comparison is case-sensitive):

In [72]:
store_name = 'Walmart'

In [73]:
store_name == 'Walmart'

True

In [74]:
store_name == 'walmart'

False

In [75]:
# check whether substring contained in a string:
if 'alm' in store_name:
    print("yep.")
else:
    print("nope.")

yep.
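
If upper and lower case should not matter, one option is to convert both strings to the same case before comparing. A small sketch using the string methods `lower()`, `startswith()`, and `find()`; the variable names are illustrative only.

```python
store_name = 'Walmart'

same_ignoring_case = store_name.lower() == 'walmart'  # compare lowercase versions
starts_with_wal = store_name.startswith('Wal')        # prefix check
position = store_name.find('alm')                     # index of the first match
missing = store_name.find('xyz')                      # find() returns -1 if absent

print(same_ignoring_case, starts_with_wal, position, missing)
```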


#### For loop: Iterating through a sequence

In [76]:
for letter in store_name:
    print(letter)

W
a
l
m
a
r
t


`range()` is a function to create integer sequences:

In [77]:
# create an int range [0;7[
list(range(7))

[0, 1, 2, 3, 4, 5, 6]

In [78]:
# create an int range [4;7[
list(range(4,7))

[4, 5, 6]

In [79]:
# create an int range [7;25[, however starting from 7 add integers in steps of 3 only
list(range(7,25,3))

[7, 10, 13, 16, 19, 22]

In [80]:
# range() is very useful in combination with for-loops:
for index in range(7,21,3): # start at 7, stop before 21, step size 3
    print(index)

7
10
13
16
19


In [81]:
# range() is very useful in combination with for-loops:
for index in range(len(store_name)): # length of a sequence
    print("The %ith letter in store_name is: %s"%(index, store_name[index]))

The 0th letter in store_name is: W
The 1th letter in store_name is: a
The 2th letter in store_name is: l
The 3th letter in store_name is: m
The 4th letter in store_name is: a
The 5th letter in store_name is: r
The 6th letter in store_name is: t
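
When both the index and the element are needed, the built-in `enumerate()` is a common alternative to `range(len(...))`. A minimal sketch; the list `lines` is only there to collect the output text.

```python
store_name = 'Walmart'

lines = []
# enumerate() yields (index, element) pairs, starting at index 0
for index, letter in enumerate(store_name):
    lines.append("Index %i holds the letter %s" % (index, letter))

for line in lines:
    print(line)
```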


#### While loop: Keep doing until condition no longer holds.

Use `for` when you know __the exact number of iterations__; use `while` when you __do not (e.g., checking convergence)__.

In [82]:
x = 2

In [83]:
x

2

In [84]:
while x < 10:
    print(x)
    x = x + (x-1)
    # x += x-1

2
3
5
9
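
A sketch of the convergence case mentioned above: we keep halving a value until it drops below a tolerance, without knowing in advance how many steps that takes. The starting value and the tolerance are arbitrary choices for this example.

```python
value = 1.0       # hypothetical starting value
tolerance = 0.01  # stop once value is at or below this threshold
steps = 0

while value > tolerance:
    value = value / 2  # each iteration halves the value
    steps += 1

print(steps, value)  # 7 0.0078125
```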


#### Notes on `break` and `continue`

`break` exits the loop immediately. Any code after the `break` inside the loop body will NOT be executed.

In [85]:
store_name = 'Walmart'

In [86]:
index = 0
while True:
    print(store_name[index])
    index += 1 # a += b means a = a + b
    if store_name[index] == "a":
        print("-> End at a, position: ", index)
        break # instead of setting flag to False, we can directly break out of the loop
        print("Hello!") # This will NOT be run

W
-> End at a, position:  1


`continue` jumps to the next iteration of the loop: it skips the rest of the current iteration and __continues__ with the next one.

In [88]:
for letter in store_name:
    if letter == "a":
        continue # Not printing 'a'
    else:
        print(letter)

W
l
m
r
t
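
`break` also works inside a `for` loop. The sketch below searches for the first occurrence of a letter and stops as soon as it is found. Because the loop is bounded by the length of the string, it also terminates when the letter does not occur. The names `target` and `found_at` are made up for this illustration.

```python
store_name = 'Walmart'
target = 'm'
found_at = -1  # stays -1 if target never occurs

for index in range(len(store_name)):
    if store_name[index] == target:
        found_at = index
        break  # stop searching after the first match

print(found_at)  # 3
```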
