# Sorting algorithms


Sorting is fundamental; its importance needs little further justification.

Nice visualizations can be found here.

Important sorting algorithms are:

- O(n²): insertion sort, bubble sort, selection sort.
- O(n log n): quicksort, merge sort, heapsort.

Comparison-based (swap-based) sorting algorithms cannot beat O(n log n) in the worst case. However, under special conditions (e.g. integer keys in a known range) there are O(n + k) non-comparison algorithms such as counting sort and bucket sort.
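As a minimal sketch of such an O(n + k) non-comparison sort, here is counting sort (the simplest relative of bucket sort), assuming the inputs are integers known to lie in [0, k):

```python
def counting_sort(nums, k):
    """Sort non-negative integers known to lie in [0, k).

    No element-to-element comparisons: O(n + k) time, O(k) extra space.
    """
    counts = [0] * k
    for x in nums:                      # O(n): tally each value
        counts[x] += 1
    out = []
    for value, c in enumerate(counts):  # O(n + k): emit values in order
        out.extend([value] * c)
    return out
```

For example, `counting_sort([3, 1, 0, 3, 2], 4)` returns `[0, 1, 2, 3, 3]`.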

More details can be found on Wikipedia.

In terms of best-case complexity:

Among the O(n²) algorithms, bubble sort (with an early-exit check) and insertion sort achieve O(n) in the best case, i.e. on already-sorted input. Selection sort, however, is still O(n²) even in the best case.
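To see the best case concretely, here is a small insertion sort instrumented with a comparison counter (the counter is added purely for illustration): on already-sorted input it performs only n − 1 comparisons, i.e. O(n).

```python
def insertion_sort(nums):
    """Insertion sort; returns (sorted copy, number of key comparisons)."""
    a = list(nums)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]   # shift larger elements right
                j -= 1
            else:
                break             # sorted input: one comparison per element
        a[j + 1] = key
    return a, comparisons
```

On `range(10)` this does 9 comparisons; on the reversed input it does 45 = n(n−1)/2, the worst case.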

In terms of stable or not:

Bubble sort and insertion sort are stable, while selection sort is not.

Merge sort is stable; quicksort is not stable in its usual in-place form (it can be made stable with extra space); heapsort is not stable.

Why use a stable sort, and when do we need one? The answers can be found here:

https://www.drdobbs.com/cpp/why-stable-sorting-is-important/231001864
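A minimal illustration of stability, using Python's built-in `sorted` (which is guaranteed stable): when two records compare equal on the sort key, their original relative order is preserved.

```python
records = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]

# Sort by score only; names are NOT part of the key.
by_score = sorted(records, key=lambda r: r[1])

# Stable: within each score, the original order survives
# (bob before dave, alice before carol).
print(by_score)
# [('bob', 1), ('dave', 1), ('alice', 2), ('carol', 2)]
```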

A problem related to sorting can be found here.

## 179. Largest Number

### Solution

A reference solution can be found HERE.
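A common approach (sketched here as an assumption about the intended reference, not a transcription of it) is to sort the numbers as strings with a custom comparator: `a` should come before `b` when the concatenation `a + b` forms the larger number.

```python
from functools import cmp_to_key

def largest_number(nums):
    """LeetCode 179: arrange nums to form the largest number (as a string)."""
    def cmp(a, b):
        # negative => a comes first; we want a first when a+b beats b+a
        return (a + b < b + a) - (a + b > b + a)

    digits = sorted(map(str, nums), key=cmp_to_key(cmp))
    result = "".join(digits)
    # collapse the all-zero case, e.g. [0, 0] -> "0", not "00"
    return "0" if result[0] == "0" else result
```

For example, `largest_number([3, 30, 34, 5, 9])` returns `"9534330"`.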

## 215. Kth Largest Element in an Array

Several Python solutions from HERE, by Cai Kehe:

```python
import heapq


class Solution(object):
    # Python 3 version (the original post used Python 2's xrange)

    # O(n log n) time: sort everything
    def findKthLargest1(self, nums, k):
        return sorted(nums, reverse=True)[k - 1]

    # O(nk) time, bubble sort idea, TLE
    def findKthLargest2(self, nums, k):
        for i in range(k):
            for j in range(len(nums) - i - 1):
                if nums[j] > nums[j + 1]:
                    # exchange elements, time consuming
                    nums[j], nums[j + 1] = nums[j + 1], nums[j]
        return nums[len(nums) - k]

    # O(nk) time, selection sort idea
    def findKthLargest3(self, nums, k):
        for i in range(len(nums), len(nums) - k, -1):
            tmp = 0
            for j in range(i):
                if nums[j] > nums[tmp]:
                    tmp = j
            nums[tmp], nums[i - 1] = nums[i - 1], nums[tmp]
        return nums[len(nums) - k]

    # O(n log n) time: push all n elements, then pop the n-k smallest
    def findKthLargest4(self, nums, k):
        heap = []
        for num in nums:
            heapq.heappush(heap, num)
        for _ in range(len(nums) - k):
            heapq.heappop(heap)
        return heapq.heappop(heap)

    # O(n log k) time, using heapq.nlargest
    def findKthLargest5(self, nums, k):
        return heapq.nlargest(k, nums)[k - 1]

    # O(n) average time (O(n^2) worst case), quickselect
    def findKthLargest(self, nums, k):
        # convert the kth largest to the (n+1-k)th smallest
        return self.findKthSmallest(nums, len(nums) + 1 - k)

    def findKthSmallest(self, nums, k):
        if nums:
            pos = self.partition(nums, 0, len(nums) - 1)
            if k > pos + 1:
                return self.findKthSmallest(nums[pos + 1:], k - pos - 1)
            elif k < pos + 1:
                return self.findKthSmallest(nums[:pos], k)
            else:
                return nums[pos]

    # choose the right-most element as pivot (Lomuto-style partition)
    def partition(self, nums, l, r):
        low = l
        while l < r:
            if nums[l] < nums[r]:
                nums[l], nums[low] = nums[low], nums[l]
                low += 1
            l += 1
        nums[low], nums[r] = nums[r], nums[low]
        return low
```

## Quickselect

Similar to quicksort, the quickselect algorithm can be used to solve this problem, with average complexity O(n):

Quickselect uses the same overall approach as quicksort, choosing one element as a pivot and partitioning the data in two based on the pivot, accordingly as less than or greater than the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side: the side with the element it is searching for. This reduces the average complexity from O(n log n) to O(n), with a worst case of O(n²). (Wikipedia)

## Why is quickselect O(n) when the pivot is perfectly chosen?

With a perfect pivot, the subproblem size halves at every step, following the pattern n, n/2, n/4, … The corresponding geometric series 1 + 1/2 + 1/4 + … sums to 2, so the total work is O(2n) = O(n).
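Written out, the total work under perfect pivots is bounded by a geometric series:

$$
n + \frac{n}{2} + \frac{n}{4} + \cdots \;\le\; n \sum_{i=0}^{\infty} \frac{1}{2^{i}} \;=\; 2n \;=\; O(n)
$$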

## When do we get the worst-case complexity of quickselect?

Like quicksort, quickselect has good average performance, but is sensitive to the pivot that is chosen. If good pivots are chosen, meaning ones that consistently decrease the search set by a given fraction, then the search set decreases in size exponentially, and by induction (or summing the geometric series) one sees that performance is linear, as each step is linear and the overall time is a constant times this (depending on how quickly the search set reduces). However, if bad pivots are consistently chosen, such as decreasing by only a single element each time, then worst-case performance is quadratic: O(n²). This occurs, for example, when searching for the maximum element of a set, using the first element as the pivot, on sorted data. (Wikipedia)
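The worst case can be checked empirically with a small instrumented quickselect that always pivots on the first element of the current range (the comparison counter is added purely for illustration): on sorted input, searching for the maximum costs n(n−1)/2 comparisons.

```python
def quickselect_first_pivot(nums, k):
    """Return the k-th smallest (0-based) element, always pivoting on the
    first element of the current range; also returns the comparison count."""
    nums = list(nums)
    comparisons = 0
    left, right = 0, len(nums) - 1
    while left < right:
        # Lomuto partition with nums[left] as pivot (moved to the end first)
        nums[left], nums[right] = nums[right], nums[left]
        pivot = nums[right]
        store = left
        for i in range(left, right):
            comparisons += 1
            if nums[i] < pivot:
                nums[i], nums[store] = nums[store], nums[i]
                store += 1
        nums[store], nums[right] = nums[right], nums[store]
        if k == store:
            return nums[store], comparisons
        elif k < store:
            right = store - 1
        else:
            left = store + 1
    return nums[left], comparisons
```

On `range(20)` with k = 19 (the maximum), this performs 190 = 20·19/2 comparisons, the quadratic pattern described above.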

Pseudocode from Wikipedia:

```
function partition(list, left, right, pivotIndex) is
    pivotValue := list[pivotIndex]
    swap list[pivotIndex] and list[right]  // Move pivot to end
    storeIndex := left
    for i from left to right − 1 do
        if list[i] < pivotValue then
            swap list[storeIndex] and list[i]
            increment storeIndex
    swap list[right] and list[storeIndex]  // Move pivot to its final place
    return storeIndex

// Returns the k-th smallest element of list within left..right inclusive
// (i.e. left <= k <= right).
function select(list, left, right, k) is
    if left = right then          // If the list contains only one element,
        return list[left]         // return that element
    pivotIndex := ...             // select a pivotIndex between left and right,
                                  // e.g., left + floor(rand() % (right − left + 1))
    pivotIndex := partition(list, left, right, pivotIndex)
    // The pivot is in its final sorted position
    if k = pivotIndex then
        return list[k]
    else if k < pivotIndex then
        return select(list, left, pivotIndex − 1, k)
    else
        return select(list, pivotIndex + 1, right, k)
```

## Direct translation of the pseudocode to C++

A direct C++ translation of the [quickselect](https://en.wikipedia.org/wiki/Quickselect) pseudocode from Wikipedia.

It is pretty fast; the reported runtime:

```cpp
/*
Runtime: 16 ms, faster than 95.11% of C++ online submissions for Kth Largest Element in an Array.
Memory Usage: 10 MB, less than 96.23% of C++ online submissions for Kth Largest Element in an Array.
*/
#include <cstdlib>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    int partition(vector<int>& list, int left, int right, int pivot_index)
    {
        auto pivot_value = list[pivot_index];
        swap(list[pivot_index], list[right]);  // move pivot to the end
        auto store_index = left;
        for (int i = left; i < right; ++i)
        {
            if (list[i] < pivot_value)
            {
                swap(list[i], list[store_index++]);
            }
        }
        swap(list[store_index], list[right]);  // move pivot to its final place
        return store_index;
    }

    int select(vector<int>& list, int left, int right, int k)
    {
        if (left == right) return list[left];
        auto pivot_index = left + rand() % (right - left + 1);
        pivot_index = partition(list, left, right, pivot_index);
        if (k == pivot_index) return list[pivot_index];
        else if (k < pivot_index)
        {
            return select(list, left, pivot_index - 1, k);
        }
        else
        {
            return select(list, pivot_index + 1, right, k);
        }
    }

    int findKthLargest(vector<int>& nums, int k)
    {
        // the kth largest is the (n - k)th smallest (0-based)
        return select(nums, 0, nums.size() - 1, nums.size() - k);
    }
};
```

## 148. Sort List

A nice video about this problem can be found in [2].
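For reference, a sketch of the standard O(n log n) approach (merge sort on the linked list, splitting with slow/fast pointers); the `ListNode` definition mirrors LeetCode's, and this is one common solution rather than a transcription of the video.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def sort_list(head):
    """Merge sort on a singly linked list: O(n log n) time."""
    if head is None or head.next is None:
        return head
    # split in half with slow/fast pointers
    slow, fast = head, head.next
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    mid = slow.next
    slow.next = None
    left, right = sort_list(head), sort_list(mid)
    # merge the two sorted halves
    dummy = tail = ListNode()
    while left and right:
        if left.val <= right.val:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next
```

Using `<=` in the merge keeps the sort stable.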