Work through Section 3 of the Python Tutorial http://docs.python.org/3/tutorial/ till the end of 3.1.2 Strings.
Read the first chapter of the book Programming Knights till the end of 1.7 Examples of Programs Using the input() statement.
Also, look up the definition of str.find at http://docs.python.org/3/library/stdtypes.html#str.find.
Introduction to Python in the context of implementing a simple search engine.
Based on:
Udacity Course CS101 "Intro to Computer Science"
Build a Search Engine |
---|
Find Data (Weeks 1-3) |
Build an Index (Weeks 4-5) |
Rank Pages (optional) |
print(3)
print(1 + 1)
print(52 * 3 + 12 * 9)
print((52 * 3) + (12 * 9))
print(52 * (3 + 12) * 9)
print(365 * 24 * 60 * 60)

# Write Python code that prints out
# the number of minutes in 5 weeks.

# the following Python code produces a syntax error
print(2 + 2 +)
Expression -> Expression Operator Expression
Expression -> Number
Operator -> +
Operator -> *
Number -> 0, 1, 2, ...
Expression -> (Expression)
Print_Statement -> print(Expression)
Which of the following are valid Python expressions that can be produced starting from Expression?
3
((3)
(1*(2*(3*4)))
+ 3 3
(((7)))
2 + 2 +
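The grammar can also be checked mechanically. The following is only a sketch, not part of the course material: it rewrites the left-recursive rule as "a term followed by any number of operator-term pairs", which accepts the same strings. It uses functions and loops, which these notes have not covered yet, and the names tokenize, parse_term, parse_expression and is_valid_expression are made up for this illustration.

```python
import re


def tokenize(text):
    """Split text into number, operator and parenthesis tokens (None if invalid)."""
    tokens = re.findall(r"\d+|[+*()]", text)
    # Anything besides digits, +, *, parentheses and spaces is rejected.
    if "".join(tokens) != text.replace(" ", ""):
        return None
    return tokens


def parse_term(tokens, pos):
    """Match Number or (Expression) at pos; return the next position or -1."""
    if pos >= len(tokens):
        return -1
    if tokens[pos].isdigit():
        return pos + 1
    if tokens[pos] == "(":
        pos = parse_expression(tokens, pos + 1)
        if pos != -1 and pos < len(tokens) and tokens[pos] == ")":
            return pos + 1
    return -1


def parse_expression(tokens, pos):
    """Match a term followed by zero or more (operator, term) pairs."""
    pos = parse_term(tokens, pos)
    while pos != -1 and pos < len(tokens) and tokens[pos] in ("+", "*"):
        pos = parse_term(tokens, pos + 1)
    return pos


def is_valid_expression(text):
    """True if the whole text can be produced starting from Expression."""
    tokens = tokenize(text)
    return tokens is not None and parse_expression(tokens, 0) == len(tokens)


print(is_valid_expression("1 + 2 * (3 + 4)"))  # True
print(is_valid_expression("(5"))               # False
print(is_valid_expression("7 *"))              # False
```

Try it on the candidates above only after you have worked out your own answers.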
# Write Python code to print out
# how far light travels in centimeters in one nanosecond.
#
# speed of light = 299 792 458 meters / second
# meter = 100 centimeters
# nanosecond = 1.0 / 1000000000

# Write Python code to print out
# how far light travels in centimeters in one nanosecond.
speed_of_light = 299792458      # meters per second
billionth = 1.0 / 1000000000    # one nanosecond, in seconds
meter = 100                     # centimeters per meter
print(speed_of_light * meter * billionth)

# Given the variables defined here, write Python
# code that prints out the distance, in meters,
# that light travels in one processor cycle.
speed_of_light = 299792458
cycles_per_second = 2700000000

# = means assignment
speed_of_light = 299792458
# 2.7 GHz
cycles_per_second = 2700000000
print(speed_of_light * 1.0 / cycles_per_second)
# 2.8 GHz
cycles_per_second = 2800000000
print(speed_of_light * 1.0 / cycles_per_second)
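A side note on the `* 1.0` in the division: it forces a decimal result in Python 2, where `/` between two whole numbers discards the remainder. In Python 3, which the tutorial linked at the top describes, `/` always gives a decimal result and `//` is the operator that discards the remainder. A small sketch of the difference, reusing the variables from the cell above:

```python
speed_of_light = 299792458
cycles_per_second = 2700000000

# True division: the result is a float (roughly 0.11 meters per cycle).
print(speed_of_light / cycles_per_second)

# Floor division: the fractional part is discarded.
print(speed_of_light // cycles_per_second)  # 0

print(7 / 2)   # 3.5
print(7 // 2)  # 3
```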
# What is the value of hours after running this code?
hours = 9
hours = hours + 1
hours = hours * 2

# What is the value of seconds after running this code?
minutes = minutes + 1
seconds = minutes * 60

# Write Python code that defines the variable
# age to be your age in years, and then prints
# out the number of days you have been alive.
Besides numbers, Python can also manipulate strings, which can be expressed in several ways. They can be enclosed in single quotes or double quotes with the same result. \ can be used to escape quotes:
print('I am a string')
print("I prefer double quotes")
print("I'm happy I started with a double quote")
print('I don\'t mind a single quote')
# using a variable
hello = "Hello"
print(hello)

# Which of the following is a valid string?
"Ada"
'Ada"
"Ada
Ada
'"Ada'

# Define a variable, name, and assign to it
# a string that is your name
Strings can be concatenated (glued together) with the + operator and repeated with the * operator.
# Define a variable, name, and assign to it
# a string that is your name.
# Print out the word Hello followed by your name and three !'s
name = "Pawel"
print("Hello " + name + " !!!")
# print out the text "repeat three times" three times
text = "repeat three times "
print(3 * text)
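One detail that often trips people up: + joins a string only to another string, while * needs a string and a whole number. A minimal sketch, where the variable names count and message are chosen just for this example:

```python
count = 3

# "Cycles: " + count would raise a TypeError, because count is a number.
# Converting it with str() first makes the concatenation work.
message = "Cycles: " + str(count)
print(message)        # Cycles: 3

# Repetition takes a string and a whole number, in either order.
print("ab" * count)   # ababab
print(count * "ab")   # ababab
```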
Strings can be indexed (subscripted), with the first character having index 0. There is no separate character type; a character is simply a string of size one.
# < string >[< expression >]

#      012345678901234
print('Intro to Python'[0])      # => 'I'
#      012345678901234
print('Intro to Python'[1+1])    # => 't'
#       01234
name = 'Pawel'
print(name[1])                   # => 'a'
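To see that a single character really is just a short string, you can inspect the result of indexing. This sketch uses the built-in functions type() and len(); the variable name letter is only for illustration.

```python
letter = 'Intro to Python'[0]

print(type(letter))          # <class 'str'>
print(len(letter))           # 1
# Indexing a one-character string gives back the same one-character string.
print(letter[0] == letter)   # True
```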
Indices may also be negative numbers and allow us to start counting from the right (the end of the string).
Note that -0 is the same as 0, so negative indices start from -1.
#      0123
print('word'[-1])    # => 'd'
#      0123
print('word'[-2])    # => 'r'
#       01234
name = 'Pawel'
print(name[-3])      # => 'w'
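Put differently, for a string of 5 characters the index -k refers to the same character as the index 5 - k. A short sketch with the string from the cell above:

```python
name = 'Pawel'   # 5 characters, valid indices are 0..4 and -5..-1

print(name[-1] == name[5 - 1])   # True, both are 'l'
print(name[-3] == name[5 - 3])   # True, both are 'w'
print(name[-5] == name[0])       # True, both are 'P'

# Going past either end, as in name[5] or name[-6], raises an IndexError.
```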
# Which of these pairs are two things
# with the exact same value?
# s is a variable whose value is an arbitrary string
print( s[3] , s[1+1+1] )
print( s[0] , (s+s)[0] )
print( s[0]+s[1] , s[0+1] )
print( s[1] , (s+' is OK')[1] )
print( s[-1] , (s+s)[-1] )
In addition to indexing, slicing is also supported. While indexing is used to obtain an individual character, slicing allows you to obtain a substring.
# < string > [< expression >] => one-character string
#                number
print('Pawel'[1])      # => 'a'
print('Pawel'[1:3])    # => 'aw'

#                start            stop
# < string > [< expression > : < expression >]
#     s          number           number
# => string that is a subsequence of
#    the characters in s
#    starting from position start and
#    ending with position stop-1

#       01234
name = 'Pawel'
print(name[2:4])    # => 'we'

#       012345
word = 'assume'
print(word[3])      # => 'u'
print(word[4:6])    # => 'me'
print(word[4:])     # => 'me'
print(word[:2])     # => 'as'
print(word[:])      # => 'assume'
# Write Python code that prints out Ucf (with a capital U),
# using the string variable s that
# is assigned the string 'ucf'.
s = 'ucf'

# for any string s = '< any string >'
# which of these is always equivalent
# to s?
s[:]
s + s[0:-1+1]
s[0]
s[:-1]
s[:3] + s[3:]
One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n.
#  +---+---+---+---+---+---+
#  | P | y | t | h | o | n |
#  +---+---+---+---+---+---+
#  0   1   2   3   4   5   6
# -6  -5  -4  -3  -2  -1
#
# The slice from i to j consists of all
# characters between the edges labeled i and j
print("Python"[-4:6])    # => 'thon'
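The edge picture also explains a few boundary cases. The sketch below reuses the string from the diagram: a slice between the same edge is empty, a slice from edge i to edge j inside the string has j - i characters, and slice bounds that are too large are quietly cut off at the end of the string (unlike single indices, which raise an error).

```python
s = "Python"

print(s[0:6])         # 'Python', from the first edge to the last
print(s[2:2])         # '', nothing lies between edge 2 and edge 2
print(len(s[1:4]))    # 3, that is 4 - 1 characters ('yth')

# Out-of-range slice bounds are handled gracefully.
print(s[4:42])        # 'on'
print(s[42:])         # ''
```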
Python strings cannot be changed: they are immutable. Therefore, assigning to an index position in a string results in an error.
word = 'Python'
word[0] = 'M'                # => TypeError: 'str' object does not support item assignment
new_word = 'M' + word[1:]    # => 'Mython'
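Because the failed assignment stops the program, the lines after it never run when the cell is executed as a whole. The following sketch catches the error so that both steps can be seen in one run; try/except has not been introduced in these notes and is used here only to keep the example runnable.

```python
word = 'Python'

try:
    word[0] = 'M'
except TypeError as error:
    # The string was not modified; Python reports the attempt instead.
    print("Cannot modify the string:", error)

# Building a new string from pieces of the old one works fine.
new_word = 'M' + word[1:]
print(word)       # Python
print(new_word)   # Mython
```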
The built-in function len() returns the length of a string.
s = 'megahypergigasuperlongstring'
print(len(s))
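A few more facts about len() that are handy when indexing, shown as a short sketch with the same string: the empty string has length 0, lengths add up under concatenation, and the last valid index is always len(s) - 1.

```python
s = 'megahypergigasuperlongstring'

print(len(s))            # 28
print(len(''))           # 0
print(len(s + '!!!'))    # 31
print(s[len(s) - 1])     # 'g', the last character

# s[len(s)] is one position too far and would raise an IndexError.
```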
# < string >.find(< string >)
search_string.find(target_string)
# => number of the first position
#    in search_string at which
#    target_string appears
# => -1 if target_string is not found

#                          11111111
#                012345678901234567
search_string = 'Python is so cool!'
# note search_string is a VARIABLE
target_string = 'cool'
search_string.find(target_string)    # => 13
search_string.find('boring')         # => -1
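Two things are easy to get wrong here. First, find reports only the first match. Second, 0 is a perfectly good result (a match at the very beginning), so "not found" has to be tested by comparing with -1. A small sketch using the same search_string; the variable name position is just for illustration.

```python
search_string = 'Python is so cool!'

# Only the first of the four o's is reported.
print(search_string.find('o'))       # 4

# A match at the very beginning gives 0, which is not the same as "not found".
position = search_string.find('Python')
print(position)                      # 0
print(position != -1)                # True

# If the position itself is not needed, the in operator is enough.
print('cool' in search_string)       # True
```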
# Which of the following evaluate to -1?
'test'.find('t')
"test".find('st')
"Test".find('te')
'west'.find('test')

# Assume that s is a variable that stores
# an arbitrary string
# Which of the following always has the value 0?
s.find(s)
s.find('s')
's'.find(s)
s.find('')
s.find(s+'!!!')+1
# < string >.find(< string >, < number >)
search_string.find(target_string, pos)
# => number of the first position
#    in search_string at which
#    target_string appears
#    at or after pos
# => -1 otherwise
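The second argument makes it possible to continue a search where the previous one stopped. As a sketch, this walks through every 'o' in the example string from earlier; the variable names first to fifth are made up for the illustration.

```python
search_string = 'Python is so cool!'

first = search_string.find('o')                 # 4  (in 'Python')
second = search_string.find('o', first + 1)     # 11 (in 'so')
third = search_string.find('o', second + 1)     # 14 (in 'cool')
fourth = search_string.find('o', third + 1)     # 15 (in 'cool')
fifth = search_string.find('o', fourth + 1)     # -1, no more matches
print(first, second, third, fourth, fifth)

# "At or after pos" includes pos itself.
print(search_string.find('o', 4))               # 4
```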
# For any variables s and t that are strings,
# a variable i that is a number,
# which of the following is equivalent to s.find(t,i)
s[i:].find(t)
s.find(t)[:i]
s[i:].find(t)+i
s[i:].find(t[i:])
# none of these
HTML or HyperText Markup Language is the main markup language for creating web pages and other information that can be displayed in a web browser.
HTML is written in the form of HTML elements consisting of tags enclosed in angle brackets (like <html>), within the web page content.
HTML tags most commonly come in pairs like <h1> and </h1>, although some tags represent empty elements and so are unpaired, for example <img>.
The first tag in a pair is the start tag, and the second tag is the end tag (they are also called opening tags and closing tags). In between these tags web designers can add text, further tags, comments and other types of text-based content.
The purpose of a web browser is to read HTML documents and compose them into visible or audible web pages. The browser does not display the HTML tags, but uses the tags to interpret the content of the page.
HTML elements form the building blocks of all websites.
HTML allows images and objects to be embedded and can be used to create interactive forms. It provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items.
It can embed scripts written in languages such as JavaScript which affect the behavior of HTML web pages.
Web browsers can also refer to Cascading Style Sheets (CSS) to define the appearance and layout of text and other material.
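For the exercises that follow, a web page is just a string stored in a variable. The page below is a made-up minimal example, not the one the course uses; it is written with triple quotes so that the string can span several lines, and its only link points at the Python Tutorial mentioned at the top of these notes.

```python
# A hypothetical, very small web page stored as a Python string.
page = '''<html>
<body>
<h1>A very simple page</h1>
<p>Read <a href="http://docs.python.org/3/tutorial/">the Python Tutorial</a>.</p>
</body>
</html>'''

print(page)

# Tags are ordinary characters inside the string, so find works on them.
print(page.find('<h1>') != -1)       # True, the page has a heading
print(page.find('<table>') != -1)    # False, the page has no table
```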
# Write Python code that initializes the variable
# start_link to be the value of the position
# at which <a href= occurs for the first time in
# the string variable page

# Write Python code that initializes the variable
# start_link to be the value of the position
# at which <a href= occurs for the first time in
# the string variable page
start_link = page.find('<a href=')

# Write Python code that assigns to the variable
# url a string that is the value of the first
# URL that appears in a link tag in
# the string variable page
start_link = page.find('<a href=')
# add missing code

# Write Python code that assigns to the variable
# url a string that is the value of the first
# URL that appears in a link tag in
# the string variable page
start_link = page.find('<a href=')
start_quote = page.find('"', start_link)
end_quote = page.find('"', start_quote + 1)
url = page[start_quote + 1 : end_quote]
print(url)
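To try the solution out, page needs a value. The sketch below runs the same steps on a made-up one-line page and then on a text without any link. The second case matters because find returns -1 when there is no link tag, and slicing with that value would give a meaningless result. The guard uses an if statement, which these notes have not covered yet, and the names text_without_link and missing_url are assumptions made for this example.

```python
# A hypothetical page with exactly one link.
page = 'Intro <a href="http://docs.python.org/3/tutorial/">tutorial</a> end'

start_link = page.find('<a href=')
start_quote = page.find('"', start_link)
end_quote = page.find('"', start_quote + 1)
url = page[start_quote + 1 : end_quote]
print(url)   # http://docs.python.org/3/tutorial/

# Without a link tag, find returns -1, so check before slicing.
text_without_link = 'There are no links here.'
start_link = text_without_link.find('<a href=')
if start_link == -1:
    missing_url = None
else:
    start_quote = text_without_link.find('"', start_link)
    end_quote = text_without_link.find('"', start_quote + 1)
    missing_url = text_without_link[start_quote + 1 : end_quote]
print(missing_url)   # None
```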