Binary search - Introduction

Binary search is a fundamental technique in the development of algorithms and an essential tool in your algorithmist toolkit.

In its most pure form, binary search is used to find an element in a sorted array. We will start by solving that problem. This is quite a straightforward thing to do but you will see that binary search is incredibly powerful and in the next tasks we will see how we can use it to solve problems that seem very complex at first.

But first things first, let's solve the following problem:

Given a sorted array \(a\) of \(n\) integers, how can we efficiently determine whether it contains a given target element \(t\)?

Let's say we check element at position \(i\). If \(a[i] = t\) then we can report that \(a\) contains \(t\) at position \(i\). Otherwise, we either have \(a[i] < t\) or \(a[i] > t\).

Since \(a\) is sorted, if \(a[i] < t\) then we know that all elements before index \(i\), that is, \(a[0], a[1], \ldots, a[i - 1]\) are also smaller than \(t\). Hence we can discard them and save us the trouble of comparing all of the against \(t\). In other words, we reduced the problem of finding \(t\) in \(a\) to the problem of finding \(t\) in \(a[i + 1], \ldots, a[n - 1]\).

On the other case, when \(a[i] > t\) we can apply an analogous reasoning to discard all elements after index \(i\).

Conclusion: by checking any index \(i\) we will either find \(i\) or be able to discard all indexes smaller than \(i\) or all indexes greater than \(i\). If we select \(i\) to be the middle index, then we are sure that, in the worse case, we can discard half of the elements of the array.

But then what do we do once we discarded half of the array?

We can repeat the same process recursively on the new smaller array and discard half again and so on until either we find \(t\) or we discard all elements of the array.

In order to avoid copying the sub arrays, we will simply keep track of the index of the first and last elements of the array. Call them \(L\) and \(R\), respectively. In the beginning we start with \(L = 0\) and \(R = n - 1\). The middle element can be computed as

\begin{equation*} M = (L + R) / 2 \end{equation*}

where the division stands for integer division (so that for instance 5 / 2 = 2 and not 2.5).

By what we said above, if \(a[M] < t\) then we can discard the lower half of the array. This can be achieve by setting \(L\) to \(M + 1\). If \(a[M] > t\) then we can discard the upper half of the array by setting \(R\) to \(M - 1\). We stop when the array becomes empty, that is, when \(L > R\), returning \(-1\) indicating that \(t\) is not an element of \(a\).

In code we get the following:

// iterative implementation
static int binSearch(int[] a, int t) {
    int L = 0;
    int R = a.length - 1;
    while(L <= R) {
        int M = (L + R) / 2;
        if(a[M] == t) return M;
        if(a[M] < t) L = M + 1;
        if(a[M] > t) R = M - 1;
    }
    return -1;
}

You can also implement it recursively. Depending on the person, some people find one easier to read than the other.

// recursive implementation
static int binSearch(int[] a, int t) {
    return binSearchAux(a, t, 0, a.length - 1);
}

static int binSearchAux(int[] a, int t, int L, int R) {
    if(L > R) return -1;
    int M = (L + R) / 2;
    if(a[M] < t) return binSearchAux(a, t, M + 1, R);
    if(a[M] > t) return binSearchAux(a, t, L, M - 1);
    return M;
}

Whichever you use, should not really matter except that maybe the recursive implementation may have some little overheard because of the recursion stack. In most problems that should not have any impact and you should use the one you feel more comfortable with.

This code runs in \(O(\log(n))\) because at each iteration the size of the interval is roughly cut in half and \(log(n)\) basically describes the number of times \(n\) needs to be halved to reach \(1\).

In most programming languages, this function is already implemented. For instance, in Java, you can use Arrays.binarySearch().


Find first occurence in sorted array

Given a sorted array \(a\) you will be given \(q\) queries each with an integer \(t\). For each query you need to find, if one exists, the minimum index \(i\) such that \(a[i] = t\).

Note that the code above does not necessarily find the minimum index. You will need to modify the code in order to achieve this.

Input

  • One line with two integers \(n\) and \(q\) giving the size of the array and the number of queries.
  • \(n\) integers separated giving the values \(a_i\) of \(a\).
  • \(q\) lines each with one integer \(t\) giving the query target.

Constraints

  • \(1 \leq n \leq 10^5\)
  • \(-2^{30} \leq a_i \leq 2^{30}\)
  • \(1 \leq q \leq 10^5\)
  • \(-2^{30} \leq t \leq 2^{30}\)

Output

\(q\) lines, each with a single integers giving the minimum index of \(t\) in \(a\) or \(-1\) if no such index exists.

Sample Test Cases

Sample input 1

Sample output 1

Sample input 2

Sample output 2


Max file size: 1.0MB
Allowed extensions: .java, .cpp, .py

Information

Deadline No deadline
Submission limit No limitation

Sign in