Binary Search in JavaScript

binary search in javascript code

“Binary search” is a computer science term used in programming and software engineering. It is a way of determining if a value exists in an ordered data set. It’s considered a textbook algorithm. It is useful because it reduces the search space by half on each iteration.

Imagine we are asked to search for an integer in an array that is sorted in ascending order. The challenge is to return the target number’s index.

binary search

We could use a built in JavaScript function indexOf():

var search = function(nums, target) {
    var answer = nums.indexOf(target);
    console.log(answer);
    return answer;
};
console.log(search([-1,0,3,4,5,9,12], 5)); // returns '4'

But that is an abstraction, which can be slow and expensive. Instead, we should implement our own binary search code.

Binary Search Implementation

Binary search starts in the middle of a collection, and checks if that is the requested term. The initial middle position is determined by taking the length of the array and dividing it by two. If the array has an even number of elements, then the ‘halfway position’ is considered close enough. I use a JavaScript Math function to round down, making sure I am not left with a decimal floating number:

cursor =  Math.floor( ((left + right) / 2) ); // about the middle

If the search term is not found, the position (the cursor) is moved. Since the collection is ordered from smallest to largest we know which direction to move the search cursor. If the target is less than the current value, we shift one to the left. If the target is greater than the current value, we shift two to the right.

This continues until the target query is found. That sequence is ran inside of a loop. The loop repeats while the left boundary is less than (or equal to) the right. The search cursor’s index is changed by adjusting the left or right bounds. Once adjusted, the cursor is recalculated on the subsequent iteration.

var search = function(nums, target) {
    var left = 0; // first element, initial left boundary
    var right = nums.length;
    right = right - 1; // last element, initial right boundary
    var cursor = 0;
    while(left <= right){
        cursor =  Math.floor( ((left + right) / 2) ); // about the middle to start
        console.log(cursor + ": " + nums[cursor]);
        if(nums[cursor] === target){
            return cursor; // query found!
        }
        if(target < nums[cursor]){
            right = cursor - 1; // move cursor to the left by one
            console.log("cursor moved to the left  by one")
        }else{
            left = cursor + 1; // move cursor to the right by two
            console.log("cursor moved to the right by two")
        }
    }
    return "false"; // not found
};
console.log(search([-1,0,1,2,3,4,5,6,7,8,9,10,11,12], 10));

The above code starts to search in the central location, nums[6] == 5. Look at the output below to see how the cursor moves until it finds the target value of 10 (at a zero-based index of 11):

binary search output

This algorithm has an O(log n) runtime complexity. That Big-O notation describes how it would scale as the input increases. Imagine an array with thousands of elements. O(log n) means that the time to execute this code will increase linearly only as the number of input items increase exponentially. That means our code is performant and efficient.

Why did the computer scientist use binary search to find his lost pencil? Because he wanted to reduce the search space by half every time he checked a desk drawer!

Square root algorithm using binary search

Another implementation example of using binary search in a JavaScript function is to calculate the square root of a number. Now, of course, Javascript has a built in Math.sqrt() function that is highly performant compared to any custom code we will write. But, it is a good exercise to think about how this works.

The square root of a number is a value that, when multiplied by itself, gives the original number. To solve for it when given an input, we are going to take a guess at what the answer (square root) is, starting with itself divided by two (giving us the midpoint between it and zero). We multiply that midpoint by itself to check our guess. It will only be correct on the first iteration if the input is the number four (four divided by two is two; two times two is four). Otherwise, we will move the cursor (halving the search range) to either the left or the right, depending on if the square of our guess is larger or smaller than the original input.

The concept of “moving the cursor” helps to visualize how the binary search algorithm zeroes in on the square root by iteratively narrowing down the search range based on the comparison of the square of the midpoint to the target number.

I accounted for two special cases in my code. For negative numbers we return false because negative numbers have imaginary square roots. For numbers smaller than one we set the range’s end to one because the square root of a fraction is larger than the fraction itself. Additionally, I set a precision threshold (0.0000000001), because some numbers have square roots that are irrational.

function binarySearchSquareRoot(n, precision = 1e-10) {
    if (n < 0) {
        throw new Error('Input must be a non-negative number');
    }

    let start = 0;
    let end = n;

    // If the number is less than 1, set end to 1 to handle fractions
    if (n < 1) {
        end = 1;
    }

    while (end - start > precision) {
        const mid = (start + end) / 2;
        const midSquared = mid * mid;

        if (midSquared === n) {
            return mid;  // Found exact square root
        } else if (midSquared < n) {
            start = mid;
        } else {
            end = mid;
        }
    }

    return (start + end) / 2;  // Return approximate square root
}

// Usage:
const number = 25;
const squareRoot = binarySearchSquareRoot(number);
console.log(`The square root of ${number} is approximately ${squareRoot}`);

Alternatively, the Babylonian algorithm can be  implemented to solve for square roots.

Additional References

https://delboy1978uk.wordpress.com/2018/02/06/binary-search-trees-in-php/
https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/

Redirecting Traffic from EC2 Public DNS to a Domain Name

301 redirect from hostname

I searched Google for my own website, and noticed that some results were returning with my EC2 instance’s public hostname: http://ec2-18-191-145-67.us-east-2.compute.amazonaws.com/ – That’s a bad look for branding, and I figured it couldn’t be good for SEO. I fixed it by adding a new rewrite condition to my apache server’s httpd.conf file. The server runs on a Linux 2 image, and I found that file in this directory: /etc/httpd/conf/.

Initial Configuration

I already had a Virtual Host block that listens on port 80, responsible for making sure all traffic going through my domain name goes to the https://www.antpace.com. I just had to add the public DNS to the ServerAlias statement and add an additional RewriteCond statement:

    
<VirtualHost *:80>
    DocumentRoot "/var/www/html"
    ServerName "antpace.com"
    ServerAlias "www.antpace.com" "ec2-18-191-145-67.us-east-2.compute.amazonaws.com"
    RewriteEngine on
    RewriteCond %{HTTP_HOST} ^ec2-18-191-145-67.us-east-2.compute.amazonaws.com$ [OR]
    RewriteCond %{SERVER_NAME} =antpace.com [OR]
    RewriteCond %{SERVER_NAME} =www.antpace.com
    RewriteRule ^ https://www.antpace.com%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

Handling SSL Traffic

After restarting the server, I did another search, and saw some pages still returning with the SSL encrypted address https://ec2-18-191-145-67.us-east-2.compute.amazonaws.com/.

SERP

To handle traffic going there, I needed to add another Virtual Host block that would listen on port 443. You’ll notice that I had to reference my SSL certificates that were installed via Let’s Encrypt.

<VirtualHost *:443>
    DocumentRoot "/var/www/html"
    ServerName "antpace.com"
    ServerAlias "www.antpace.com" "ec2-18-191-145-67.us-east-2.compute.amazonaws.com"
    
    SSLEngine on
    SSLCertificateFile /etc/letsencrypt/live/antpace.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/antpace.com/privkey.pem
    SSLCertificateChainFile /etc/letsencrypt/live/antpace.com/chain.pem
    
    RewriteEngine on
    RewriteCond %{HTTP_HOST} ^ec2-18-191-145-67.us-east-2.compute.amazonaws.com$ [OR]
    RewriteCond %{HTTP_HOST} ^antpace.com$ [OR]
    RewriteCond %{HTTP_HOST} ^www.antpace.com$
    RewriteRule ^ https://www.antpace.com%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

Unfortunately, adding this block broke the site temporarily. It resulted in a redirect loop, in tandem with the previous block. I tried to adjust them both, resulting in the following:

<VirtualHost *:80>
    DocumentRoot "/var/www/html"
    ServerName "antpace.com"
    ServerAlias "www.antpace.com" "ec2-18-191-145-67.us-east-2.compute.amazonaws.com"

    RewriteEngine on
    # Redirect all HTTP traffic to HTTPS
    RewriteCond %{HTTPS} off
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
</VirtualHost>

<VirtualHost *:443>
    DocumentRoot "/var/www/html"
    ServerName "antpace.com"
    ServerAlias "www.antpace.com" "ec2-18-191-145-67.us-east-2.compute.amazonaws.com"

    SSLEngine on
    SSLCertificateFile /etc/letsencrypt/live/antpace.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/antpace.com/privkey.pem
    SSLCertificateChainFile /etc/letsencrypt/live/antpace.com/chain.pem

    RewriteEngine on
    # Redirect all non-preferred hostnames to the preferred hostname
    RewriteCond %{HTTP_HOST} ^ec2-18-191-145-67.us-east-2.compute.amazonaws.com [NC,OR]
    RewriteCond %{HTTP_HOST} ^antpace.com [NC]
    RewriteRule ^ https://www.antpace.com%{REQUEST_URI} [L,R=301]
</VirtualHost>

Although this configuration resolved the redirect loop, it failed to address the original issue. I discovered that my system had a separate configuration file for SSL traffic: /etc/httpd/conf.d/ssl.conf. I added the redirect rule and certificate references to the 443 Virtual Host block within this file, but the redirect issue persisted.

Diagnosing the Issue

Executing sudo apachectl -S revealed a conflicting Virtual Host on port 443 in httpd-le-ssl.conf. I moved the rewrite rules to this file, yet the redirect remained elusive. Suspecting a browser cache issue, I verified the redirect using curl:

301 redirect via CLI CURL

The 301 Moved Permanently response confirmed the redirect to https://www.antpace.com. Since the redirection works when tested with curl -k but not in a browser, the issue likely stems from the browser not accepting the SSL certificate due to the mismatch. Browsers are more strict with SSL certificate validation than curl -k.

I attempted to add the public host name as a subject alternative name (SAN) in my existing SSL certificate, but Let’s Encrypt refuses to issue certificates for EC2 hostnames. This restriction is due to Let’s Encrypt’s policy to prevent abuse. To circumvent this, I generated a self-signed certificate:

sudo mkdir -p /etc/ssl/private 
sudo mkdir -p /etc/ssl/certs
sudo chmod 700 /etc/ssl/private
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/ssl/private/ec2-selfsigned.key -out /etc/ssl/certs/ec2-selfsigned.crt

Final Configuration

I created a new Apache configuration file for the EC2 redirection and ensured no conflicts with existing configurations:

sudo nano /etc/httpd/conf.d/ec2-redirect.conf

Here’s the contents:

<IfModule mod_ssl.c>
<VirtualHost *:443>
    DocumentRoot "/var/www/html"
    ServerName "ec2-18-191-145-67.us-east-2.compute.amazonaws.com"

    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/ec2-selfsigned.crt
    SSLCertificateKeyFile /etc/ssl/private/ec2-selfsigned.key

    RewriteEngine on
    # Redirect EC2 URL to the preferred domain
    RewriteRule ^ https://www.antpace.com%{REQUEST_URI} [L,R=301]
</VirtualHost>
</IfModule>

Upon restarting Apache, the 301 redirect functioned correctly in the browser, albeit with an initial security warning due to the self-signed certificate. Accepting the certificate led me to the desired URL. This project underscored the importance of meticulous configuration and validation in managing web traffic and ensuring optimal SEO performance.

Pro-tip: Each time I had to edit any of these sever configuration files, I did so from the command line using the Nano text editor. I learned along the way that I could select multiple lines of text by pressing “Ctrl + Shift + 6”, pressing the arrows to expand the selection, and then “Ctrl + K” to remove content.

Alternative Approaches

To stop this problem from arising, even to begin with, we can take an alternative approach. It is possible, when creating a new EC2 instance, to disable the auto-assign public IP address in the network settings. This will launch the instance without a public IP address.

Optimize Your Site’s Speed: Convert Images to WebP Format

upgrading image files

In a recent post I discussed auditing existing websites for potential performance enhancements. In the process I discovered issues with the AntPace.com Lighthouse estimates (a topic I covered briefly in a post about progressive web apps).

Lighthouse performance estimate

My performance only scored 50 out of 100. One “opportunity” (estimated to save 2.16s) was to “serve images in next-gen formats”. AVIF (based on the AV1 video codec) and WebP are considered next generation because of “superior compression and quality characteristics compared to their older JPEG and PNG counterparts“. I chose to upgrade my images WebP (instead of AVIF) mostly due to greater browser support.

Before I upload images to this site, I usually run them through TinyPNG. Sometimes, I’ll also do some manual pre-processing in the GIMP, like cropping or scaling images down. I know that there  must be a better way to do all of that through a proper CI pipeline – but for now that is my low tech solution. Now that I need to come up with a new process for next-gen image formats, it might be a good time to explore a more automated solution.

Properly Size an Image for Your Website

Site speed is important for user experience and SEO. Having an image with unnecessarily large dimensions eats up bandwidth and leads to slow load times. You can open an image file in the GIMP and use the scale tool to reduce the size while maintaining the ratio. Once it is reduced, you can copy and paste it to it’s own canvas and save the file.

I write more about using this image manipulation program in another article about design for programmers

Convert multiple images to next-gen WebP file format

For converting single images, manually, one at a time I could use the GIMP or an online service. I used a command line tool on my mac called webp tools to bulk upgrade all of the files in an image directory.

brew install webp

For a single file I can use this command:

cwebp -q 80 software-education.png -o software-education.webp

To hit all of my image files in that folder I ran loops over the existing file formats. You should change this to include any other file formats used in your project.

for i in *.jpg; do cwebp -q 80 "$i" -o "${i%.jpg}.webp";done
for i in *.png; do cwebp -q 80 "$i" -o "${i%.png}.webp";done

I saved them to a new folder, so that I could easily delete all of the old files at once and then replace them. Alternatively, I can save them in the same directory and then run a command to delete all of the .png files: rm *.png

New folder amongst legacy image files

My website has been around for a while, so there are unused legacy media files. I manually searched my code base (ctrl + shift + f) for the file names listed in the various /image directories throughout my project, and deleted them if I found no references. (Maybe having so many disjointed image folders is part of a bigger problem with this project). If I did find it, I updated the file extension to webp. Later, I wrote a script to automatically find and delete unused image files from a project.

This change increased my Lighthouse performance score by ten points (I gained an additional two but upgrading MariaDB from version 10.2 to 10.4). I did this process is a few other directories of my project to complete the upgrade. I was going to convert my Apple touch icon files to a next-gen format too, but the documentation specify the use of the PNG format (good thing I checked).

WordPress and WebP

Having adopted WebP as the new standard file format for AntPace.com, I decided to upload .webp images to my posts – starting with this one. When I tried, WordPress complained: “This image cannot be processed by the web server. Convert it to JPEG or PNG before uploading.”

error from wordpress when trying to upload a next generation image file

This surprised me. I upgraded WordPress to the latest version (6.3.1 at that time), but it didn’t help. After further investigation, it looked like the problem had to do with the PHP image module gd.

I SSH’d into my EC2 instance to see what I could do. When I tried to install gd for my PHP version 7.4, I got a version mismatch error. It had to due with Amazon Linux 2 (the OS I run on AWS EC2) not supporting PHP 7.4 as it approaching or have passed their end-of-support dates. Every time I tried to installthe gd module, Amazon Linux Extras (the default Amazon Linux 2 package mechanism) would try to pull the version compatible with PHP 7.2. In an attempt to make it work, I manually disabled ‘amazon-linux-extras’, installed the Remi repository and made sure it was prioritized as my package manager. Still, “Packages skipped because of dependency problems”.

Amazon Linux 2 is at EOL.

The same thing happened when I tried using ImageMagick instead. This made me consider my Linux distribution. Not having gd installed what causing other problems when uploading media through WordPress (responsive image sizes are not being generated).

I had been wanting to upgrade the size of my EC2 instance anyway, so this might be the right time. I am considering Amazon Linux 2023 or a Bitnami image.  As you know, I’ll write a blog post about which I choose and the implementation details