insertion sort algorithm has the complexity of O(n²) as 1 + 2 +3 +… +n = (n+1)*n/2

PyTorch provides two data primitives:

`torch.utils.data.DataLoader`

and`torch.utils.data.Dataset`

that allow you to use pre-loaded datasets as well as your own data.`Dataset`

stores the samples and their corresponding labels, and`DataLoader`

wraps an iterable around the`Dataset`

to enable easy access to the samples. From [1]

[1] gave a pretty good example of FashionMNIST

**import** torch

**from** torch.utils.data **import** Dataset

**from** torchvision **import** datasets

**from** torchvision.transforms **import** ToTensor

**import** matplotlib.pyplot **as** plt

training_data **=** datasets**.**FashionMNIST(

root**=**"data",

train**=True**,

download**=True**,

transform**=**ToTensor()

)

Let’s explore this code.

If we run the above code on the terminal, it will download the data first and…

LeetCode 1857 Largest Color Value in a Directed Graph

This problem is the 4th question of the weekly contest on May 8, 2021. In order to solve this problem, we should be familiar with:

- Topological sort
- How to check whether we have cycles in a graph: directed graph and undirected graph.
- DFS

Topological sort is a classical one, and I will not give explanation. Pls check this post to recall this classical algorithm. Actually, that post already tells us how to check whether it has cycle in a directed graph.

From two LeetCode questions. From those problems, we can learn:

- Two pointer
- Deque
- Binary search

`class Solution {`

public:

int minSubArrayLen(int target, vector<int>& nums) {

int i = 0, j = 0;

const int n = nums.size();

int ret = n + 1;

for (int j = 0; j < n; ++j) {

// update the nums to prefix_sum…

A discussion between the difference of a vector and a stack in C++ can be found here.

My question is if stack “**needs to be provided the underlying container**. By default it’s `std::deque`

, but it can be `std::vector`

or `std::list`

too.”. My questions are:

- Why don’t we directly use the vector? As vector seems faster than deque.
- When should we use stack instead of vector?

In order to fully answer those questions, I’d like to do some online searching.

From [1], it is pretty clear described about what is stack.

stacks are implemented ascontainer adaptors, which are classes that…

`A (4d array): 8 x 1 x 6 x 5`

B (3d array): 7 x 1 x 5

Result (4d array): 8 x 7 x 6 x 5

We can grasp the general idea from the above example. Looking from the right to left: if we only have those 3 cases, then the broadcasting is valid:

- dimension is the same
- one of the dimension is 1
- one of the dimensions does not exist.

For the above example, look from right to left

first dimension: A and B are the same, good

second dimension: B is one, good

third dimension: A is one, good

fourth dimension: B doesn't exist, good.

So the broadcasting is valid.

For more details see here.

The point of using

`emplace_back`

is to avoid creating a temporary object, which is then copied (or moved) to the destination. While it is also possible to create a temporary object, then pass that to`emplace_back`

, it defeats (at least most of) the purpose. What you want to do is pass individual arguments, then let`emplace_back`

invoke the ctor with those arguments to create the object in place. From [1]

The code is for a two pointer problem

`const int INF = 1e9;`

struct Query{

int id, prefered, minSize;

// Line 1…

Leetcode 1847 Closest Room

This two pointer problem is the 4th problem of last week’s biweekly contest problem.

Sort the query by minSize. Sort the room by size.

Then one pass to check the query from maximum minSize and add the satisified rooms whose size is larger than the required size into a set.

I remember there is a similar problem before, I forget that problem. I will add that paoblem in this article when I find it.

Two Sorts: O(m log m), O(n log n)

After sorting, when find the results, we are searching from a balanced binary search…

Lots of posts mentioned that GCD caculation is O(1). Why?

I’d like to share a detail analysis from a lecture note. Also this lecture note is a good one if you want to understand more about number thoery required for a CS student.

- First, GCD caculation is related to The Euclidean algorithm

“gcd(a0,a1) = gcd(a1,a2) = … = gcd(ak−1,ak) “ (check the lecture notes 2. The Euclidean algorithm to get details) - Second, GCD caculation:

“a_i is always an upper bound on some Fibonacci number “

“This shows that the number of steps in the Euclidean algorithm is logarithmic in a0, which means…

Multiset in C++ is a balanced bianry search tree.

Code

`typedef long long LL;`

class MKAverage {

public:

struct Range {

multiset<int> s;

LL sum = 0;

void insert(int x) {

s.insert(x);

sum += x;

}

void remove(int x) {…

Data Scientist/MLE/SWE @takemobi