
How to check whether a website link has your URL backlink or not - NodeJs implementation

Gorav Singal

April 08, 2020

TL;DR

Read a list of URLs from a text file and use Node.js to fetch each page, parse the HTML, and check if it contains your website's backlink.


Introduction

I hired a freelancer to do SEO backlink work for my site. The deliverable was about 3000 links, and the links freelancers provide are usually broken. So I wanted to test every single one of them to check that the URL is actually live and that the page contains a backlink to my URL.

NodeJs automation

I wrote a simple Node.js script that reads a list of URLs from a text file and, one by one, checks that each URL is reachable and that the page contains the backlink.
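As a rough outline of that flow, here is a minimal sketch using plain async/await. The names `checkAll` and `checkOne` are hypothetical and are not part of the project code shown later; `checkOne` stands in for whatever function fetches a page and reports whether the backlink is present.

```javascript
// Sketch only: checkAll and checkOne are illustrative names, not part of the project code.
// checkOne(url, site) is assumed to return a Promise that resolves to true or false.
async function checkAll(urls, site, checkOne) {
    const results = [];
    // Process the URLs strictly one at a time, in file order.
    for (const url of urls) {
        let ok;
        try {
            ok = await checkOne(url, site);
        } catch (err) {
            // A network or parsing error counts as a failed link.
            ok = false;
        }
        console.log(ok ? 'success' : 'failed', url);
        results.push({ url, ok });
    }
    return results;
}
```

Checking one URL at a time is slow for a few thousand links, but it keeps the load on the remote sites low and makes the log easy to read.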

Input

  1. A text file containing the list of URLs, one per line
  2. My website name: xyz.com
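A list delivered by someone else may well contain blank lines, duplicates, or entries that are not URLs at all. The following is a minimal sketch of cleaning the file contents before checking them, assuming one URL per line. `parseUrlList` is a hypothetical helper name, not part of the project code.

```javascript
// Sketch only: parseUrlList is a hypothetical helper, not part of the project code.
// Returns the unique http(s) URLs found in the text, in their original order.
function parseUrlList(text) {
    const seen = new Set();
    // Accept both Unix (\n) and Windows (\r\n) line endings.
    for (const line of text.split(/\r?\n/)) {
        const candidate = line.trim();
        if (candidate === '') continue; // skip blank lines
        try {
            const { protocol } = new URL(candidate);
            if (protocol === 'http:' || protocol === 'https:') {
                seen.add(candidate); // a Set silently drops duplicates
            }
        } catch (err) {
            // Not a parseable absolute URL; skip it.
        }
    }
    return [...seen];
}
```

The result can be passed to the checker in place of the raw `split('\n')` output.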

Code

Following is the directory structure:

project
    - app.js
    - src/http/url_checker.js
    - package.json
    - urls.txt

package.json

{
  "name": "check_links_seo",
  "version": "1.0.0",
  "description": "For checking link validity work given by freelancers",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Gorav Singal",
  "license": "ISC",
  "dependencies": {
    "async": "^3.2.0",
    "cheerio": "^1.0.0-rc.3",
    "request": "^2.88.2",
    "request-promise": "^4.2.5"
  }
}

app.js

const urlChecker = require('./src/http/url_checker');
const fs = require('fs');

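// Read one URL per line; trim whitespace and skip blank lines
const urls = fs.readFileSync('urls.txt', 'utf8')
    .split(/\r?\n/)
    .map(line => line.trim())
    .filter(line => line.length > 0);

// Remember to put your website here
const myWeb = 'xyz.com';

urlChecker.checkYourLinkInUrls(urls, myWeb)
    .then(() => {
        console.log('Successfully finished...');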
    })
    .catch(err => {
        console.error(err);
    });

url_checker.js

const rp = require('request-promise');
const cheerio = require('cheerio');
const async = require('async');

class UrlChecker {
    checkYourLinkInUrls(urls, desiredWebsite) {
        return new Promise((resolve, reject) => {
            async.eachLimit(urls, 1, (url, callback) => {
                return this.__checkYourLinkInUrl(url, desiredWebsite)
                    .then(function (res) {
                        if (!res) {
                            console.log('failed', url);
                        }
                        else {
                            console.log('success', url);
                        }
                        callback();
                    }).catch(function (err) {
                        callback(err);
                    });
            }, function (err) {
                if (err) {
                    reject(err);
                } else {
                    resolve();
                }
            });
        });
    }

    __checkYourLinkInUrl(url, desiredWebsite) {
        // console.log('Checking url: ', url);
        return rp(url)
            .then(html => {
                // Case-insensitive substring check on the raw HTML
                return html.toLowerCase().includes(desiredWebsite.toLowerCase());
                // const $ = cheerio.load(html);
                // const links = $('a');

                // let found = false;
                // $(links).each(function(i, link){
                //     const web = $(link).attr('href');
                //     console.log(web);
                //     // console.log($(link).text() + ':\n  ' + $(link).attr('href'));
                //     if (web.startsWith(desiredWebsite)) {
                //         found = true;
                //         return found;
                //     }
                // });
                // // console.log($(links));
                // return found;
            })
            .catch(err => {
                // console.error('Error in url', url, err);
                return false;
            });
    }
}

module.exports = new UrlChecker();

Note: The code above only checks whether the page's HTML contains my website name anywhere in it. The commented-out code checks the actual links (the href of each anchor tag) instead, but it is somewhat more expensive in both computation and memory.
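If you do enable the link-by-link check, note that an href normally starts with a scheme such as https://, so a plain `startsWith('xyz.com')` comparison would not match it, and anchors without an href give `undefined`. Below is a minimal sketch of a hostname comparison that could be called from inside the cheerio loop. `linksTo` is a hypothetical helper, and treating subdomains such as www.xyz.com as a match is my assumption.

```javascript
// Sketch only: linksTo is a hypothetical helper, not part of the project code.
// Returns true when href (found on the page at pageUrl) points at site or a subdomain of it.
function linksTo(href, pageUrl, site) {
    if (!href) {
        return false; // anchor tag without an href attribute
    }
    let host;
    try {
        // Resolve relative links against the page they were found on.
        host = new URL(href, pageUrl).hostname.toLowerCase();
    } catch (err) {
        return false; // malformed href
    }
    const wanted = site.toLowerCase();
    return host === wanted || host.endsWith('.' + wanted);
}
```

Inside the loop this would be used as `linksTo($(link).attr('href'), url, desiredWebsite)`. Comparing hostnames also avoids false positives from pages that only mention the site name in plain text.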

Run code

npm install
node app.js

Thanks for reading…
