Binary Search
What is Binary Search?
Binary Search is an efficient divide-and-conquer algorithm used to find the position of a target value within a sorted array. It works by repeatedly dividing the search range or interval in half until the target value is found or the search range is empty.
Divide-and-conquer is algorithm design paradigm that works by recursively breaking down a problem into smaller subproblems, then solving each subproblem, and combining the solutions to solve the original problem. Think this algorithm like you want to solve a big puzzle.
Three main steps:
- Divide: Break/Split the problem into smaller subproblems
- Conquer: Solve the subproblems recursively
- Combine: Merge the solutions of the subproblems to solve the original problem
Key Characteristics:
- Recursion
- Subproblem Independence
- Best case for problems that can be broken down into smaller, similar problems
When to use:
- Problems that can be divided into smaller subproblems
- Need better time complexity than native approaches like brute-force
Here is one of the basic example: Assuming you want to find the maximum number in an array [8, 3, 9, 2, 7, 1, 5, 4].
First, we will divide the array into two halves:
- Left half:
[8, 3, 9, 2] - Right half:
[7, 1, 5, 4]
[8, 3, 9, 2] | [7, 1, 5, 4]
[8, 3] | [9, 2] | [7, 1] | [5, 4]
[8] | [3] | [9] | [2] | [7] | [1] | [5] | [4]Next, we will conquer each half by finding the maximum number in each sub-array recursively:
- Maximum of left half:
9 - Maximum of right half:
7
Finally, we will combine the results by comparing the two maximums:
- Overall maximum:
max(9, 7) = 9s
max(8,3)=8 | max(9,2)=9 | max(7,1)=7 | max(5,4)=5
max(8,9)=9 | max(7,5)=7
max(9,7)=9 ← Final answer!Advantanges
- Efficiency: Often reduces time complexity (eg, O(n^2) to O(n log n))
- Simplicity: Breaks complex problems into manageable parts
- Parallelism: Subproblems can often be solved in parallel
- Cache Performance: Smaller subproblems may fit better in cache
Disadvantages
- Overhead: Recursive calls can add overhead
- Memory Usage: May require additional memory for recursion stack
- Not Always Applicable: Not all problems can be divided effectively, as some problems have better solutions
| Problem | Naive Approach | Divide & Conquer |
|---|---|---|
| Sorting | O(n²) | O(n log n) |
| Searching | O(n) | O(log n) |
| Multiplication | O(n²) | O(n¹.⁵⁸) |
There are some key requirements for Binary Search to work effectively:
- The array must be sorted in ascending or descending order.
- Random access to elements is required (i.e., the data structure should allow direct access to any element by its index, like an array, not a linked list).
- The data is static (not frequently changing), as frequent insertions or deletions would require re-sorting the array.
O(log n) indicates that the time complexity of an algorithm grows logarithmically with the size of the input data set. This means that the time grows very slowly as the input size increases.
The base of the logarithm is typically 2.
For example, if you have a dataset of size 1,000,000, an O(log n) algorithm would take about 20 steps to process it, because log2(1,000,000) is approximately 20.
Best Case: O(log n) - The target element is found at the middle position in the first comparison.
Worst Case: O(log n) - The target element is not present, and the search space is halved until it becomes empty.
Average Case: O(log n) - On average, the search space is halved with each comparison.
For space complexity, Binary Search can be implemented in two ways:
- Iterative Approach: Uses O(1) space since it only requires a constant amount of additional memory for variables.
- Recursive Approach: Uses O(log n) space due to the call stack created by recursive function calls.`
- Large sorted datasets where efficiency is crucial
- Situations where quick lookups are needed
- Applications with strict performance requirements
- Phonebook, dictionary, file systems, etc
Now, you need to remember that Binary Search cannot apply to unsorted data structure, as if the data is not sorted, it will lead to incorrect results.
How it works?
Basically, we will initialize two pointers or indices to represent the current search range within the array:
leftorlow- pointing to the start of the array(index 0)rightorhigh- pointing to the end of the array(index length - 1)mid- pointing to the middle element of the current search range(left + right) / 2
We will use mid value to compare with the target value. If the mid value is equal to the target value, we have found the target and return its index. If the mid value is less than the target value, we know that the target must be in the right half of the array, so we update the left pointer to mid + 1. If the mid value is greater than the target value, we know that the target must be in the left half of the array, so we update the right pointer to mid - 1. We repeat this process until we either find the target or the search range becomes empty (i.e., left exceeds right).
I will explain in detail in the implementation section below.
- Standard Binary Search - Finds the exact target value.
- First/Last Occurrence (Lower/Upper Bound) Binary Search - Finds the first/last occurrence of a target in an array with duplicates.
- Lower Bound: Aiming to find the leftmost index of the target which is the first instance/occurrence of the target.
- Upper Bound: Aiming to find the rightmost index of the target which is the last instance/occurrence of the target.
- Binary Search in Rotated Array - Finds a target in a sorted array that has been rotated at some pivot point.
Why it’s so fast?
If we spoke why Binary Search is so fast compared to Linear Search, assuming we have a sorted array of 1,000,000 elements and we want to find a specific element.
In Linear Search, in the worst case, we might have to check each element one by one until we find the target or reach the end of the array. This means we could potentially make up to 1,000,000 comparisons.
But in Binary Search, we start by checking the middle element of the array. If the middle element is not the target, we can eliminate half of the array from consideration based on whether the target is less than or greater than the middle element. This halving (reduction by half) continues with each comparison.
In contrast, array with 1,000,000 elements:
- Linear Search: Up to 1,000,000 comparisons in the worst case. O(n)
- Binary Search: Only about 20 comparisons in the worst case. O(log n)
- 1st comparison: Check the middle element (500,000)
- 2nd comparison: Check the middle of the remaining half (250,000)
- 3rd comparison: Check the middle of the remaining quarter (125,000)
- …
- This halving continues until we narrow down to the target element.
Visual Analogy
Image you have a phone book with 1024 pages and you want to find a specific name. Instead of starting from page 1 and checking each name one by one (like Linear Search), you can open the book to the middle page (page 512) and see if the name you’re looking for is before or after that page.
- 1st guess: Page 512
- 2nd guess: Page 256 or 768 (depending on whether the name is before or after page 512)
- 3rd guess: Page 128, 384, 640, or 896
- …
- 10th guess: You will find the name within 10 guesses because each guess effectively halves the number of pages you need to consider.
So, you only need 10 guesses to search through 1024 pages.
Implementation
In this variation, you’re given a sorted array and a target value. The goal is to find whether the target exists in the array and return its index.
def binary_search(arr: list, target: int) -> int:
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2
if target > arr[mid]:
left = mid + 1
elif target < arr[mid]:
right = mid - 1
else:
return mid
return -1
arr = [1, 2, 3, 4, 5]
target = 4
result = binary_search(arr, target)
if result != -1:
print(f"Target {target} found at index: {result}")
else:
print(f"Target {target} not found")Let’s search for target value 4 in the sorted array [1, 2, 3, 4, 5].
Step 1:
Index: 0 1 2 3 4
Array: [1, 2, 3, 4, 5]
↑ ↑ ↑
left mid rightmid = (0 + 4) // 2 = 2 -> arr[2] = 3 < 4 -> move left
Step 2:
Index: 0 1 2 3 4
Array: [1, 2, 3, 4, 5]
↑ ↑ ↑
left mid rightmid = (3 + 4) // 2 = 3 -> arr[3] = 4 == 4 -> found at index 3
Visualization
Binary Search
Binary Search is an efficient algorithm for finding a target value in a sorted array. It works by repeatedly dividing the search interval in half, comparing the middle element with the target, and eliminating half of the remaining elements.
Time Complexity
- Best Case: O(1) - Element found at middle
- Average Case: O(log n) - Halves search space each time
- Worst Case: O(log n) - Element at end or not present
Space Complexity
O(1) - No extra space required (iterative)
Requirements
The array must be sorted. Binary search is much faster than linear search for large datasets.