# Sorting algorithms

Sorting is important and if is important without further explanation.

Nice visualizations can be found here.

Important sorting algorithms are:

O(N²) insertion, bubble, selection sort.

O(nlogn) algorithms are quick, merge, heap.

If algorithms are based on the swap, the complexity will be more than O(nlogn), however, we do have some O(n+k) algorithms in the special condition such as bucket sort.

More content can be found from Wikipedia.

In terms of best-case complexity:

For average O(N²) sorting algorithms, bubble sort and insertion sort in the best case complexity is O(n). However, the selection sort in the best case is still O(n²).

In terms of stable or not:

Bubble and insertion algorithms are stable, while selection sort is not stable.

Merge sort is stable, quicksort can be stable, heap sort is not stable.

Why stable sort and when do we need stable sort? The answers can be found here:

https://www.drdobbs.com/cpp/why-stable-sorting-is-important/231001864

A problem related to sorting can be found here

# Solution

A reference solution can be found HERE.

## 215. Kth Largest Element in an Array

Different Python2 solution from HERE by Cai Kehe

`# O(nlgn) timedef findKthLargest1(self, nums, k):    return sorted(nums, reverse=True)[k-1]    # O(nk) time, bubble sort idea, TLEdef findKthLargest2(self, nums, k):    for i in xrange(k):        for j in xrange(len(nums)-i-1):            if nums[j] > nums[j+1]:                # exchange elements, time consuming                nums[j], nums[j+1] = nums[j+1], nums[j]    return nums[len(nums)-k]    # O(nk) time, selection sort ideadef findKthLargest3(self, nums, k):    for i in xrange(len(nums), len(nums)-k, -1):        tmp = 0        for j in xrange(i):            if nums[j] > nums[tmp]:                tmp = j        nums[tmp], nums[i-1] = nums[i-1], nums[tmp]    return nums[len(nums)-k]    # O(k+(n-k)lgk) time, min-heapdef findKthLargest4(self, nums, k):    heap = []    for num in nums:        heapq.heappush(heap, num)    for _ in xrange(len(nums)-k):        heapq.heappop(heap)    return heapq.heappop(heap)# O(k+(n-k)lgk) time, min-heap        def findKthLargest5(self, nums, k):    return heapq.nlargest(k, nums)[k-1]    # O(n) time, quick selectiondef findKthLargest(self, nums, k):    # convert the kth largest to smallest    return self.findKthSmallest(nums, len(nums)+1-k)    def findKthSmallest(self, nums, k):    if nums:        pos = self.partition(nums, 0, len(nums)-1)        if k > pos+1:            return self.findKthSmallest(nums[pos+1:], k-pos-1)        elif k < pos+1:            return self.findKthSmallest(nums[:pos], k)        else:            return nums[pos] # choose the right-most element as pivot   def partition(self, nums, l, r):    low = l    while l < r:        if nums[l] < nums[r]:            nums[l], nums[low] = nums[low], nums[l]            low += 1        l += 1    nums[low], nums[r] = nums[r], nums[low]    return low`

## Quickselect

Similar to quicksort, the quick select algorithm can be used here to solve the problem with the complexity of:

Quickselect uses the same overall approach as quicksort, choosing one element as a pivot and partitioning the data in two based on the pivot, accordingly as less than or greater than the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side — the side with the element it is searching for. This reduces the average complexity from O(n log n) to O(n), with a worst case of O(n^2). -wikipedia

## Why quick select is O(n) if we have a perfect pivot chosen?

Because, if it is perfectly chosen, the data size will follow a 1,1/2, 1/4, …pattern. The sum of this geometric series is 1+1 = 2. So the complexity is O(2n)=O(n).

## When will we get the worst-case complexity for quickselect algorithm

Like quicksort, the quickselect has good average performance, but is sensitive to the pivot that is chosen. If good pivots are chosen, meaning ones that consistently decrease the search set by a given fraction, then the search set decreases in size exponentially and by induction (or summing the geometric series) one sees that performance is linear, as each step is linear and the overall time is a constant times this (depending on how quickly the search set reduces). However, if bad pivots are consistently chosen, such as decreasing by only a single element each time, then worst-case performance is quadratic: O(n2). This occurs for example in searching for the maximum element of a set, using the first element as the pivot, and having sorted data. — wikipedia

Pseudocode from Wikipedia

`function partition(list, left, right, pivotIndex) is    pivotValue := list[pivotIndex]    swap list[pivotIndex] and list[right]  // Move pivot to end    storeIndex := left    for i from left to right − 1 do        if list[i] < pivotValue then            swap list[storeIndex] and list[i]            increment storeIndex    swap list[right] and list[storeIndex]  // Move pivot to its final place    return storeIndex// Returns the k-th smallest element of list within left..right inclusive// (i.e. left <= k <= right).function select(list, left, right, k) is    if left = right then   // If the list contains only one element,        return list[left]  // return that element    pivotIndex  := ...     // select a pivotIndex between left and right,                           // e.g., left + floor(rand() % (right − left + 1))    pivotIndex  := partition(list, left, right, pivotIndex)    // The pivot is in its final sorted position    if k = pivotIndex then        return list[k]    else if k < pivotIndex then        return select(list, left, pivotIndex − 1, k)    else        return select(list, pivotIndex + 1, right, k)`

## Direct translate pseudocode to C++ code

A direct translated C++ code from the pseudocode of the [quick select](https://en.wikipedia.org/wiki/Quickselect) algorithm mentioned from Wikipedia.
It is pretty fast and the run time is

`/*Runtime: 16 ms, faster than 95.11% of C++ online submissions for Kth Largest Element in an Array.Memory Usage: 10 MB, less than 96.23% of C++ online submissions for Kth Largest Element in an Array.*/class Solution {public:    int partition(vector<int>& list, int left, int right,int pivot_index)    {        auto pivot_value = list[pivot_index];        swap(list[pivot_index], list[right]);        auto store_index = left;        for(int i=left; i<=right;++i)        {            if (list[i] < pivot_value)            {                swap(list[i], list[store_index++]);            }         }        swap(list[store_index], list[right]);        return store_index;    }        int select(vector<int>& list, int left, int right,int k)    {        if (left==right)return list[left];        auto pivot_index = left + rand()%(right - left + 1);        pivot_index = partition(list, left, right, pivot_index);        if (k==pivot_index)return list[pivot_index];        else if(k<pivot_index)        {             return select(list, left, pivot_index-1, k);        }        else        {            return select(list, pivot_index+1, right, k);        }               }    int findKthLargest(vector<int>& nums, int k)     {        return select(nums, 0, nums.size()-1, nums.size()-k);    }};`