React-driven single-page applications (SPAs) are gaining momentum. Used by tech giants like Facebook, Twitter, and Google, React allows for building fast, responsive, and animation-rich websites and web applications with a smooth user experience.
Luckily, there are a couple of ready-made solutions for React that can help you achieve visibility in search engines. We talk about them below.
But before we dive deep into technical details of optimizing single-page applications (SPA), let’s deal with some useful theory to help you understand the challenges with SEO in React websites.
What’s a single-page app and why React?
A single-page application is a web application whose content is served in a single HTML page. This page is dynamically updated, but it doesn’t fully reload each time a user interacts with it. Instead of sending a new request to the server for each interaction and then receiving a whole new page with new content, the client side of a single-page web app only requests and loads data that needs to be updated.
Facebook, Twitter, Netflix, Trello, Gmail, and Slack are all examples of single-page apps.
From the developer’s perspective, SPAs keep the presentation layer (the front end) and the data layer (the back end) separate, so two teams can develop these parts in parallel. Also, with an SPA, it’s easier to scale and use a microservices architecture than it is with a multi-page app.
From the user’s perspective, single-page apps offer seamless interaction with a web app. Plus, these apps can cache data in local storage, meaning they’ll load even faster the second time.
Thanks to React’s component-based architecture, it’s easy to reuse code and divide a large app into smaller parts. Hence, maintaining and debugging large SPA projects is much easier than doing the same for large multi-page apps. The virtual DOM ensures high app performance. Also, the React library supports all modern browsers, including their older versions.
But as you might have guessed, React has disadvantages as well, and one of the biggest is the lack of SEO-friendliness.
Principles of search engine optimization
Search engine optimization (SEO) is the practice of increasing the quality and quantity of web traffic to a website by increasing its visibility on search engines.
The main task of SEO is to help a web app rank in search engines (Google, Bing) when the target audience searches for information by a certain keyword. Most SEO efforts focus on optimizing apps for Google. After all, as of 2020 Google has a 92.16% search engine market share according to StatCounter.
According to Backlinko, 75% of all user clicks go to the top three websites in search engine results, while websites from the second results page and beyond get only 0.78% of clicks. That’s why digital businesses are furiously fighting to get on this first page of search results.
For business owners, it’s crucial to think of an SEO optimization strategy from the very beginning of web app development to find an effective technology stack.
To determine a website’s ranking in search results, search engines use web crawlers. A web crawler is a bot that regularly visits web pages and analyzes them according to specific criteria set by the search engine. Each search engine has its own crawler, and Google’s crawler is called the Googlebot.
The Googlebot explores pages link by link, gathers information on website freshness, content uniqueness, number of backlinks, etc., downloads HTML and CSS files, and sends all of this data to Google servers. Then it’s analyzed and indexed by a system called Caffeine. This is a fully automatic process, so it’s vital to ensure that crawlers correctly understand the website content. And here’s where the problem appears.
What’s wrong with optimizing single-page applications (SPA) for search engines?
As you can see, there’s nothing except the <div> tag and an external script. Single-page applications need a browser to run a script, and only after the script is run will the content be dynamically loaded to the web page. So when a crawler visits the website, it sees an empty page without content. Hence, the page cannot be indexed.
If the content on a page updates frequently, crawlers should regularly revisit the page. This can cause problems, since reindexing may only be done a week later after the content is updated, as Google Chrome developer Paul Kinlan reports on Twitter.
Limited crawling budget
The crawl budget is the maximum number of pages on your website that a crawler can process in a certain period of time. Once that time is up, the bot leaves the site no matter how many pages it’s downloaded (whether that’s 26, 3, or 0). If each page takes too long to load because of running scripts, the bot will simply leave your website before indexing it.
Talking about other search engines, Yahoo’s and Bing’s crawlers will still see an empty page instead of dynamically loaded content. So getting your React-based SPA to rank at the top on these search engines is a will-o'-the-wisp.
You should think of how to solve this problem on the stage of designing app architecture.
Solving the problem
There are a couple of ways to make a React app SEO-friendly: by creating an isomorphic React app or by using prerendering.
Isomorphic React apps
Building an isomorphic app can be really time-consuming. Luckily, there are frameworks that facilitate this process. The two most popular solutions for SEO are Next.js and Gatsby.
Next.js is a framework that helps you create React apps that are generated on the server side quickly and without hassle. It also allows for automatic code splitting and hot code reloading. Next.js can do full-fledged server-side rendering, meaning HTML is generated for each request right when the request is made.
Gatsby is a free open-source compiler that allows developers to make fast and powerful websites. Gatsby doesn’t offer full-fledged server-side rendering. Instead, it generates a static website beforehand and stores generated HTML files in the cloud or on the hosting service. Let’s take a closer look at their approaches.
Gatsby vs Next.js
The challenge of SEO for GatsbyJS framework is solved by generating static websites. All HTML pages are generated in advance, during the development or build phase, and are then simply loaded to the browser. They contain static data that can be hosted on any hosting service or in the cloud. Such websites are very fast, since they aren’t generated at runtime and don’t wait for data from the database or API.
But data is only fetched during the build phase. So if your web app has any new content, it won’t be shown until another build is run.
This approach works for apps that don’t update data too frequently, i.e. blogs. But if you want to build a web app that loads hundreds of comments and posts (like forums or social networks), it’s better to opt for another technique.
The second approach is server-side rendering (SSR), which is offered by Next.js. In contrast with traditional client-side rendering, in server-side rendering, HTML is generated on the server, and then the server sends already generated HTML and CSS files to the browser. Also, with Next.js, HTML is generated each time a client sends a request to the server. This is vital when a web app contains dynamic data (as forums or social networks do). For SSR to work on React, developers also need to use the Node.js server, which can process all requests at runtime.
Server-side rendering with Next.js
The Next.js rendering algorithm looks as follows:
The Next.js server, running on Node.js, receives a request and matches it with a certain page (a React component) using a URL address.
The page can request data from an API or database, and the server will wait for this data.
The Next.js app generates HTML and CSS based on the received data and existing React components.
Making website SEO Friendly with GatsbyJS
The process of optimizing React applications is divided into two phases: generating a static website during the build and processing requests during runtime.
The build time process looks as follows:
Gatsby’s bundling tool receives data from an API, CMS, and file system.
During deployment or setting up a CI/CD pipeline, the tool generates static HTML and CSS on the basis of data and React components.
After compilation, the tool creates an about folder with an index.html file. The website consists of only static files, which can be hosted on any hosting service or in the cloud.
Request processing during runtime happens like this:
Creating an isomorphic app is considered the most reliable way to make React SEO-compatible, but it’s not the only option.
The idea of prerendering is to preload all HTML elements on the page and cache all SPA pages on the server with the help of Headless Chrome. One popular way to do this is using a prerendering service like prerender.io. A prerendering service intercepts all requests to your website and, with the help of a user-agent, defines whether a bot or a user is viewing the website.
If the viewer is a bot, it gets the cached HTML version of the page. If it’s a user, the single-page app loads normally.
Prerendering has a lighter server payload compared to server-side rendering. But on the other hand, most prerendering services are paid and work poorly with dynamically changing content.
Using Gatsby: A demo project
Now let’s check the audit result using Lighthouse. Its performance is close to 100, while SEO for Gatsby.js applications is measured as high as 90.
If this website were on the production server, all indicators would be approximately 100.
Stay tuned for updates. We’ll soon share the results of that experiment!
The bottom line
Single-page React applications offer exceptional performance, seamless interactions close to those of native applications, a lighter server payload, and ease of web development.
Challenges with SEO shouldn’t be a reason for you to avoid using the React library. Instead, you can use the above-mentioned solutions to fight this issue. Moreover, search engine crawlers are getting smarter every year, so in the future, SEO optimization may no longer be a pitfall of using React.