cs106b课堂笔记

 

惭愧了 这么久才又开始学习

abstration

defination

Design that hides the details of how something works while still allowing the user to access complex functionality

what is an abstraction

Key idea:Through a simpler interface, users are able to take full advantage of a complex system without needing to know how it works or how it was made.

includes

angle bracket

use of the angle bracket operators is usually reserved for code from the C++ Standard library

quote

use of the quotes is usually reserved for code from the Stanford C++ libraries, or code in files that you have written yourself

variables and types

variables

A way for code to store information by associating a value with a name
we will think of a variable as a named container storing a value

types

All variables have a type associated with them, where the type describes the representation of the variable.
In C++,all types must be explicitly defined when the variable is created, and a variable cannot change its type.

funtion

anatomy of a function

function

definition:
parameter(s):
one or more variables that your function expects as input.

argument(s):
the values passed into your function and assigned to its parameter variables.

return value:
the value that your function hands back to the "calling" function.
can think parameters as a special set of local variables that belong to a function
returnType functionName(varType parameter1, varType parameter2, ...){
    returnType variable = /* Some fancy code */
    /* more code */
    return variable;
}
a function must always be defined before it is called

Pass-by-value is the default mode of operation when it comes to parameters in C++,which means that parameter is passed to function by value,variable is a copy of the variable passed in,changing it inside the function does not change the value in the calling function.

defenition:
pass by value:When a parameter is passed into a function, the new variable stores a copy of the passed in value in memory.
pass by reference:When a parameter is passed into a function, the new variable stores a reference to the passed in value, which allows you to directly edit the original value.

When we use references

  • To allow helper functions to edit data stuctures in other functions
  • To avoid making new copies of large data structures in memory
  • References also provide a workaround for multiple return values
    • Your function can both have a return value and also directly edit a Vector object passed in as a parameter.This makes it as if your function is returning both the vector and the actual return value

When we don’t use references

  • If we always used references,functions would all be able to edit one another’s variables, and scoping would get confusing
  • When the data itself is small(i.e. the cost of copying by value is low)
  • Note: You can’t provide a literal as an argument if you are passing a parameter by reference.

循环什么的不想记(瘫

string

  • Strings are mutable in C++
  • You can add characters to strings and strings to strings using += and +
    • Strings must use double quotes(““)while characters use single(‘’)
  • You can use logical operators to compare strings(and characters)(only comparing the first letter)

when running

string hiThere = "hi"+"there";

you would get a error
cause there are cpp string and c string in cpp,when declare a string like “hi” you would get a c string,which is not allowed to be added together.

when running

string hiThere = "hi"+'!';

you would get a garbage
cause when add a different type to a c string, it would point to random garbage.

C strings vs C++ strings summary

  • C strings have no methods
    • This is why you can’t do something like “hi”.length() in C++
  • Conversion fixes
    • Store the C string in a variable first to convert it to a C++ string
    • Use a conversion function
      • string(“text”); // converts the C string literal into a C++ string
      • strig.c_str(); // return a C string from a C++ string

Takeaway:Beware of C string

Abstract Data Types

Vector

  • At a high level,a vector is an ordered collection of elements of the same type that can grow and shrink in size.
    • Each element in the collection has a specific location, or index
    • All elements in a vector must be of the same type.Unlike in other programming languages, a single vector cannot contain elements of mixed types.
    • Vectors are fixible when it comes to the number of elements they can store.You can easily add and remove elements from a vector.Vectors also know their size, meaning you can query them to see how many elements they currently contain. vector

Grid(In Stanford library)

grid

  • A 2D array, defined with a particular width and height
    • we say array instead of vector here because the dimensions are established when the grid is created
  • useful for spreadsheets,game boards,etc
  • Three ways to declare a grid
    • Grid gridName;
      Grid<int> board; \\ if you declare a board with no initialization,you must resize it or reassign it before using it.Resizing will fill it with default values for that type.
      board.resize(3,3);
      board[0][0] = 2;
      
    • Grid gridName(numRows,numCols);
      Grid<int> board(3,3);
      
    • Grid gridName = {{r0c0,r0c1,r0c2},{r1c0,r1c1,r1c2},...};
      Grid<int> board = {{2,0},{6,0}};
      
  • We could use a combination of Vectors to simulate a 2D matrix,but a Grid is easier

Tradeoffs with vectors and grids(vs. other ADTs)

  • Easily able to search through all elements
  • Can use the indices as a way of structuring the data

Struct

A way to bundle different types of information in C++,like creating a custom data structure.

  • To declare a struct,you can either assign each of its members separately or assign it when it’s created

Queue

  • First person In is the First person Out(FIFO)
    • When you remove(dequeue)people from the queue, you remove them from the front of the line
  • Last person in is the last person served
    • When you insert(enqueue)people into a queue, you insert them at the back(the end of the line)

Stack

stack

  • Modeled like an actual pancakes🥞
  • Only the top element in a stack is accessible
    • The Last item In is the First one Out(LIFO)
  • The push, pop, and top operations are the only operations allowed by the stack ADT

Common bugs:

  • If you edit the ADT within a loop,don’t use .size() in the loop’s conditions!The size changes while the loop runs.
  • Unlike with queues,you can’t iterate over a stack without destroying it

Tradeoffs with queues and stacks(vs. other ADTs)

  • What are some downsides to using a queue/stack?
    • No random access.You get the front/top,or nothing.
    • No side-effect-free traversal – you can only iterate over all elements in the structure by removing previous elements first.
    • No easy way to search through a queue/stack
  • What are some benefits?
    • Useful for lots of problem – many real-world problems can be solved with either a LIFO or FIFO model
    • Very easy to build one from array such that access is guaranteed to be fast.

Set

  • A set is a collection of elements with no duplicates
  • Sets are faster than ordered data structures like vectors – since there are no duplicate, it’s faster for them to find things
  • Sets don’t have indices

Set operands

Sets can be compared,combined,etc.

  • s1 == s2
    true if the sets contain exactly the same elements
  • s1 != s2
    true if the sets don’t contain the exact same elements
  • s1 + s2
    returns the union of s1 and s2(i.e.,all elements in both)
  • s1 * s2
    return the intersection of s1 and s2(i.e.,only the elements in both sets)
  • s1 - s2
    returns the different of s1 and s2(the elemenets in s1 but not in s2)

Common Set patterns and pitfalls

  • Use for each loops to iterate over sets ``` for(type currElem : set){ // process elements one at a time }
  • You cannot use anything that attempts to index into the set(e.g.for(int i = 0; .. ) or set[i])

Map

map

  • A map is a collection of key/value pairs, and the key is used to quickly find the value
  • A map is an alternative to ordered data structure, where the “indices” no longer need to be integers.

Map example(Using Stanford library)

// maps from string keys to string values
Map<string, string> phoneBook;

//          key         value
phoneBook["Jenny"] = "867-5309"; //or
phoneBook.put("Jenny", "867-5309");

string jennyNumber = phoneBook["Jenny"]; // or
string jennyNumber = phoneBook.get("Jenny");
cout << jennyNumber << endl;

// maps from string keys to Vector<double> values
Map<string, Vector<double>> accounts;

Common Map patterns and pitfalls

  • Use for each loops to iterate over maps
    for (type curKey : map){
      // see map values using map[curKey]
    }
    

    Don’t remove keys within the loop as you’re iterating

    for (type curKey : map.keys()){
      // see map values using map[curKey]
    }
    

    Okay to edit map within this loop because .values()/.keys() makes a Vector copy of the values/keys.

  • How to check whether the key is in the map:
    Map<string, int> freqMap;
    ...
    // get key to test if it's in the map
    if (freqMap.containsKey(key)){
      cout << key << " is in the map" << endl;
    }
    

Breadth-First Search(BFS)

引入的问题:如何每次变换一个字母且每次变换后都是一个存在的英语单词,让一个单词变为另一个单词

Data Structures

  • A data structue to represent(partial word)ladders
    • Stack
  • A data structure to store all the partial word ladders that we have generated so far and have yets to explore
    • Queue<Stack>
  • A data structure to keep track of all the words that we’ve explored so far, so that we avoid getting stuck in loops
    • Set

Pseudocode

Create an empty queue and an empty set of visited locations
Create an initial word ladder containing the starting word and add it to the queue
While the queue is not empty
 Remove the next partial ladder from the queue
 Set the current search word to be the word at the top of the ladder
 If the current word is the destination, then return the current ladder
 Generate all “neighboring” words that are valid English words and one letter away from the current word
 Loop over all neighbor words
  If the neighbor hasn’t been visited
   Create a copy of the current ladder
   Add the neighbor to the top of the new ladder and mark it visited
   Add the new ladder to the back of the queue of partial ladders

BFS vs DFS comparison

  • BFS is typically iterative while DFS is naturally expressed recursively.
  • BFS looks at all paths of a particular case, which search strategy to use depends on the problem you’re solving.
  • DFS doesn’t need to store all partial paths along the way, so it has a smaller memory footprint than BFS does.

Big-O Notation and Algorithmic Analysis

Nested Data Structures(嵌套数据结构)

Nesting data structures(using one ADTs as the data type inside of another ADT)is a great way of organizing data with complex structure

example

20241110163247 20241110163601 20241110163713

Summary

  • Powerful
    • Can express highly structured and complex data
    • Used in many real-world systems
  • Tricky
    • With increased complexity comes increased cognitive load in differentiating between the levels of information stored at each level of the Nesting
    • Specially in cpp, working with nested data structures can be tricky due to the fact that references and copies show up at different points in time.

Big-O Noatation

  • Big-O notation is a way of quantifying the rate at which quantity grows.
  • Example:
    • A square of side length r has area O($r^2$)
    • A circle of radius r has area O($r^2$)
    • the $r^2$ just says that these quantities grow at the same relative rates.It does not say that they are equal.

Example

Manufacturing
You’re working at a compony producing cat toys.It costs you some amount of money to produce a cat toy, and there was some one-time cost to set-up the factory.
What data would you need to gather to estimate the cost of producing ten million cat toys?

cost(n) = n * costPerWidget(This term grows as a function of n) + startupCost(This term does not grow) = O(n)

Nuances of Big-O

  • Big-O notation is designed to capture the rate at which a quantity grows.It does not capture information about
    • leading coefficients: the area of a square and a circle are both O($r^2$)
    • lower-order terms: there may be other factors contributing to growth that get glossed over
  • However,it’s still a very powerful tool for predicting behavior

Why runtime isn’t enough?

  • What is runtime?
    • Runtime is simply the amount of real time it takes for a program to run
  • Measuring wall-clock runtime is less than ideal, since
    • It depends on what computer you’re using,
    • What else is running on that computer,
    • Whether that computer is conserving power,
    • etc
  • Worse, individual runtimes can’t predict future runtimes

Analyze

int vectorMax(Vector<int> &v){ // 1 time unit
  int currentMax = v[0];       // 1 time unit
  int n = v.size();            // 1 time unit
  for(int i = 1; i < n; i++){  // (N+1)time units
    if(currentMax < v[i]){     // N time units
      currentMax = v[i];       // N time units
    }                          // (up to)N time units
  }                            // 1 time unit
  return currentMax;
} 
// doubling the size of the input roughly doubles the runtime.Therefore, the input and runtime have a linear(O(N))relationship. 

20241111154545

Recursion

definition

A problem-solving technique in which tasks are completed by reducing them into repeated, smaller tasks of the same form.

Two main cases (components) of recursion

  • Base case
    • The simplest version(s) of your problem that all other cases reduce to
    • An occurrance that can be answered directly
  • Recursive case
    • The step at which you break down more complex versions of the task into smaller occurrances
    • Cannot be answered directly
    • Take the “recursive leap of faith” and trust the smaller tasks will solve the problem for you

Stack Frame

gets created each time a function is called.

  • The “stack” is where in your computer’s memory the information is created.
  • A “frame” stores all of the data(variables)for that particular function call.

Approaching recursive problems

  • Look for self-similarity(An object is self-similar if it contains a smaller copy of itself)
  • Try out an example
    • Work through a simple example and then increase the complexity
    • Think about what information needs to be “stored” at each step in the recursive case
  • Ask yourself:
    • What is the base case?(What is the simplest case?)
    • What is the recursive case?(What pattern of self-similarity do you see?)
How can we reserve a string

20241111210807 20241111210935 20241111211027

Summary

  • Recursion is a problem-solving technique in which tasks are completed by reducing them into repeated,smaller tasks of the same form
  • Recursion has two main parts: the base case and the recursive case
  • The solution will get built up as you come back up the call stack
    • The base case will define the “base” of the solution you’re building up.
    • Each previous recursive call contributes a little bit to the final solution
    • The initial call to your recursive function is what will return the completely constructed answer.

Fractals

  • A fractal is any repeated, graphical pattern
  • A fractal is composed of repeated instances of the same shape or pattern, arranged in a structured way
  • Fractals and self-similar structures are often difined in terms of some parameter called the order, which indicates the complexity of the overall structure.
G-window and G-point

20241111215005 20241111215039

Why do we use recursion

  • Elegance
    • Allows us to solve problems with very clean and concise code
  • Efficiency
    • Allows us to accomplish better runtimes when solving problems
  • Dynamic -Allows us to solve problems that are hard to solve iteratively

Finding a number in a sorted list:
1.Go through the whole vector. Linear Search is O(n)

Note:
Sets and Maps don't actually use a sorted list to store information, but the general idea of searching sorted data is similar.

2.Binary search

  • Eliminate half of the data at each step.
  • Algorithm:Check the middle element at(startIndex + endIndex) / 2
    • If the middle element is bigger than your desired value, eliminate the right half of the data and repeat.
    • If the middle element is smaller than your desired value, eliminate the left half of the data and repeat.
    • Otherwise,you’ve found your element!
  • Recursive cases
    • Element at middle is too small -> binarySearch(right half of data)
    • Element at middle is too large -> binarySearch(left halt of data)
  • Base cases
    • Element at middle == desired element
    • Desired element is not in your data

20241119204231 20241119204259 20241119204502

A dynamic examle: Exploring many possiblities

20241119205908

Two kinds of recursive

20241119210118

When use backtracking recursion
  • We can generate all possible solutions to a problem or count the total number of possible solutions
  • We can find one specific solution to a problem or prove that one exists
  • We can find the best possible solution to a given problem

Permutations

A permutations of a sequence is a sequence with the same elements,though possibly in a different order.

A conclusion about decision tree’s recursive

  • The specific model of the general “choose/explore/unchoose”pattern in backtracking recursion that we applied here can be thought of as “copy,edit,recurse”
  • At each step of the recursive backtracking process, it is important to keep track of the decision we’ve made so far and the decisions we have left to make
  • Backtracking recursion can have variable branching factors at each level
  • Use of helper functions and initial empty params that get built up is common

What defines our shrinkable decision tree

  • Decision at each step(each level of the tree)
  • Options at each decision(branches from each node)
  • Information we need to store along the way

Data Struct

Arrays

  • Storage space on computers, which we often refer to as memory, is allocated in organized chunks called arrays.
  • An array is a contiguous chunk of space in the computer’s memory, split into slots, each of which can contain one piece of information.
    • Contiguous means that each slot is located directly next to the others. There are no “gaps”.
    • All arrays have a specific type. Their type dictates what information can be held in each slot.
    • Each slot has an “index” by which we can refer to it.

20250312184606

Pointers

  • Pointers take up space in memory and can store specific values.
  • A pointer always stores a memory address, which is like the specific coordinates of where a piece of memory exists on the computer.

20250312185435

Priority queue

  • A queue that orders its elements based on a provided “priority”
  • Like regular queues, you cannot index into them to get an item at a particular position.

Binary heap

  • A heap is a tree-based structure that satisfies the heap property that parents have a higher priority than any of their children.
  • Additional properties
    • Binary: Two children per parent(but no implied orderings between siblings)
    • Completely filled(each parents must have 2 children)except for the bottom level,which gets populated from left to right
  • Two types -> which we use depends on what we define as a “higher” priority
    • Min-heap: smaller numbers = higher priority(closer to the root)
    • Max-heap: larger numbers = higher priority(closer to the root)

Linked list

  • A linked list is a chain of nodes, used to store a sequence of data.
  • Each node contains two pieces of information:
    • Some piece of data that is stored in the sequence
    • A link to the next node in the list
  • We can traverse the list by starting at the first node and repeatedly following its link.
  • The end of the list is marked with some special indicator.

Something about list

  • Linked lists are chains of Node structs, which are connected by pointers.
    • Since the memory is not contiguous, they allow for fast rewiring between nodes(without moving all the other Nodes like an array might)
  • Common traversal strategy
    • While loop with a pointer that starts at the front of your list
    • Inside the while loop, reassign the pointer to the next node
  • Common bugs
    • Be careful about the order in which you delete and rewire pointers!
    • It’s easy to end up with dangling pointers or memory leaks(memory that hasn’t been deallocated but that you not longer have a pointer to)

Pointers by Reference Summary

  • If you passed a pointer into a function by value, you can change the contenets at the object you point at, but not which object you point at.
  • If you pass a pointer into a function by reference, you can also change which object is pointed at.
  • When passing in pointers by reference, be careful not to change the pointer unless you really want to change where it’s pointing.

Tree

Key Idea: The distance from each element (node) in a tree to the top of the tree (the root) is small, even if there are many elements.

A tree is hierarchical data organization structure composed of a root value linked to zero or more non-empty subtrees.

Tree Terminology Summary

  • Every non-empty tree has a root node that defines the “top” of the tree.
  • Every node has 0 or more children nodes descended from it.Nodes with no children are called leaf nodes .
  • Every node in a tree has exactly one parent node(except for the root node)
  • A path through the tree traverses adges between parents and their children.
  • The depth of a node is the length of the path between the root and that node.A tree’s height is the number of nodes in the longest path through the tree.

Tree Properties

  • Any node in a tree can only have one parent.
  • The tree cannot have any cycles.There should be no way to make a complete loop through the tree.

Binary Trees

  • In general, we’ve seen that nodes in a tree can have variable numbers of children(subtrees) and sometimes very, very many.
  • However, when working with trees in computer programs, it is common to work mostly with binary trees.
  • A binary tree is a tree where every node has either 0,1 or 2 children.No node in a binary tree can have more than 2 children.
  • Typically, the two children of a node in a binary tree are refrerred to as the left child and the right child.

20250316205755

Building a BST

BST即右节点值大于左节点的值的二叉树 </br> An optimal BST is built by repeatedly choosing the median element as the root node of a given subtree and then separating elements into groups less than and greater than that median. </br> There are multiple valid BSTs for the same set of data.

20250316213325

Some other thing

hash function

20250316221302 20250316221456 20250316221619 20250316221718 20250316221743 20250316222453

Graph

A structured way to represent relationships between different entities.

20250320171929

Two nodes are neighbors if they are directly connected by an edge.
A path between two nodes is defined be a sequence of edges that can be followed to traverse between the two nodes.
The length of a path is the number of edges that make up the path.This path has length 2.
A cycle is a path that begins and ends at the same node.
A loop is an edge directly from a node back to itself.
A node is reachable from another node if a path exists between the two nodes.
A graph is connected if all nodes are reachable from all other nodes.
A graph is complete if every node has an edge connecting it to every other node.

Different types of graphs

20250320173216 20250320173313 20250320173414