Mainnet

This is a tutorial on how to set up your Polygon Mainnet node.

We're going to assume you are already logged into your Virtual Machine as a privileged user or as the root user.

There are two binaries that will run the Polygon network:

  1. Heimdall

  2. Bor

Minimum Hardware Requirements

AWS:

  • m5a.xlarge or any equivalent instance type

Bare Metal:

  • 16GB RAM

  • 4 vCPUs

  • At least 500 GB of storage - make sure it's extendable

Setup

First of all, we're going to need to install go:

wget https://raw.githubusercontent.com/maticnetwork/node-ansible/master/go-install.sh
bash go-install.sh
sudo ln -nfs ~/.go/bin/go /usr/bin/go
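
Once the script finishes, you can confirm that the Go toolchain is available by checking its version:

go version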

After the installation is done, we can proceed with setting up the binaries required for running an RPC node:

  • Heimdall

Run the following commands to clone the repository and build the binary:

cd ~/
git clone https://github.com/maticnetwork/heimdall
cd heimdall

# Check out the proper version for mainnet or testnet
# e.g.: git checkout v0.2.11
git checkout <TAG OR BRANCH>
make install
  • Bor

Next, install the latest version of Bor.

cd ~/
git clone https://github.com/maticnetwork/bor
cd bor

# Check out the proper version
# e.g.: git checkout 0.2.16

git checkout <TAG OR BRANCH>
make bor-all
sudo ln -nfs ~/bor/build/bin/bor /usr/bin/bor
sudo ln -nfs ~/bor/build/bin/bootnode /usr/bin/bootnode

To see if everything is in place, we can run the following commands:

heimdalld version
bor version

In Heimdall's case, after the hard fork, you simply need to run the following command to generate the configuration file structure (set --chain to the value that matches your network):

heimdalld init --chain=mumbai/bor-mainnet --home /path/to/heimdall-data
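
If the initialization succeeds, the generated configuration files should appear under the config directory of the home path you chose (the rest of this guide assumes the default ~/.heimdalld home; the exact file list may vary slightly between Heimdall versions):

ls ~/.heimdalld/config
# config.toml  genesis.json  heimdall-config.toml  node_key.json  priv_validator_key.json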

Now that the binaries are installed, we're going to take care of some configuration:

Download the service.sh file:

# create the node directory if it does not already exist
mkdir -p ~/node
cd ~/node
wget https://raw.githubusercontent.com/maticnetwork/launch/master/<network-name>/service.sh

Generate the metadata file:

sudo mkdir -p /etc/matic
sudo chmod -R 777 /etc/matic/
touch /etc/matic/metadata

Generate the service files and copy them into the systemd directory:

cd ~/node
bash service.sh
sudo cp *.service /etc/systemd/system/
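
After copying the unit files, it is a good idea to make systemd pick them up (this assumes the files generated by service.sh are standard systemd units):

sudo systemctl daemon-reload
ls /etc/systemd/system | grep -E 'heimdalld|bor'
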
  • You will need to add a few details to the config.toml file. To open it, run the following command:

vi ~/.heimdalld/config/config.toml

Now, in the config file, you will have to change the moniker and add the seeds information (the placeholders below stand for the seed node IDs and addresses; the current mainnet seed list can be found in the official Polygon documentation):

moniker=<enter unique identifier>   # for example, moniker="my-sentry-node"
seeds="<seed-node-id>@<seed-node-ip>:26656,<seed-node-id>@<seed-node-ip>:26656"
  • Change the value of pex to true

  • Change the value of prometheus to true

  • Set the max_open_connections value to 100 (see the example excerpt below)
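
Putting these settings together, the relevant parts of config.toml would look roughly like the sketch below (section placement follows the standard Tendermint layout and may differ slightly between Heimdall versions; the seed values are placeholders):

moniker="my-sentry-node"

[p2p]
seeds="<seed-node-id>@<seed-node-ip>:26656,<seed-node-id>@<seed-node-ip>:26656"
pex=true

[instrumentation]
prometheus=true
max_open_connections=100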

Next, you need to make changes to the start.sh file for Bor. Open ~/node/bor/start.sh (for example with vi) and add the following flag to the bor start params:

--bootnodes "enode://0cb82b395094ee4a2915e9714894627de9ed8498fb881cec6db7c65e8b9a5bd7f2f25cc84e71e89d0947e51c76e85d0847de848c7782b13c0255247a6758178c@44.232.55.71:30303,enode://88116f4295f5a31538ae409e4d44ad40d22e44ee9342869e7d68bdec55b0f83c1530355ce8b41fbec0928a7d75a5745d528450d30aec92066ab6ba1ee351d710@159.203.9.164:30303"

To enable archive mode, you can add the following flags in the start.sh file:

--gcmode 'archive' \
--ws --ws.port 8546 --ws.addr 0.0.0.0 --ws.origins '*' \

NOTE: Running full nodes will suffice for Blast participation; you don't necessarily have to run an archive node.

Finally, we can start the services:

  • To start the Heimdall service:

sudo service heimdalld start
  • To start the Heimdall rest-server:

sudo service heimdalld-rest-server start
  • To check if Heimdall is synced:

curl localhost:26657/status

In the output, the catching_up value should be false.
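
If you have jq installed, you can extract this value directly (a small sketch relying on the standard Tendermint RPC response layout):

curl -s localhost:26657/status | jq .result.sync_info.catching_up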

Now once Heimdall is synced, run:

sudo service bor start
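
To verify that the services are running and to inspect their logs, you can use the systemd tooling (this sketch assumes the unit names generated by service.sh are heimdalld and bor):

sudo systemctl status heimdalld bor
sudo journalctl -u bor -f   # follow the Bor logs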

That's pretty much it. You now have a running Polygon Mainnet RPC node; all you need to do now is wait for it to sync. You can check whether the node is synced by running the API call listed below from inside your environment. You will need the curl and jq packages installed for this, so make sure to install them beforehand.

curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "eth_syncing","params": []}' localhost:8545

If the result is false, it means that your node is fully synced.

Another way to check which block the node is at is to run:

curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "eth_blockNumber","params": []}' localhost:8545

The result should be a hex number (e.g. 0x10c5815). If you convert it to a decimal number, you can compare it to the latest block listed on the Polygon Mainnet explorer: https://polygonscan.com/
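
If you want to convert the hex value quickly, the shell can do it for you (using the example value from above):

printf '%d\n' 0x10c5815
# 17586197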

The Polygon node exports both RPC on port 8545 and WS on port 8546.

In order to test the WS endpoint, we will need to install a package called node-ws.
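
One common way to obtain the wscat client used below is through npm (the apt package node-ws mentioned above may also provide it, depending on your distribution):

npm install -g wscat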

An example WS call would look like this:

wscat --connect ws://localhost:8546
> {"id":1, "jsonrpc":"2.0", "method": "eth_blockNumber","params": []}

Monitoring Guidelines

In order to maintain a healthy node that passes the Integrity Protocol's checks, you should have a monitoring system in place. Blockchain nodes usually offer metrics regarding the node's behaviour and health, and a popular way of exposing them is in a Prometheus-compatible format. The most popular monitoring stack, which is also open source, consists of:

  • Prometheus - scrapes and stores metrics as time series data (blockchain nodes can send their metrics to it);

  • Grafana - allows querying, visualization and alerting based on metrics (can use Prometheus as a data source);

  • Alertmanager - handles alerting (can use Prometheus metrics as data for creating alerts);

  • Node Exporter - exposes hardware and kernel-related metrics (can send the metrics to Prometheus).

We will assume that Prometheus/Grafana/Alertmanager are already installed (we will provide a detailed guide of how to set up monitoring and alerting with the Prometheus + Grafana stack at a later time; for now, if you do not have the stack already installed, please follow this official basic guide here).

We recommend installing the Node Exporter utility since it offers valuable information regarding CPU, RAM & storage. This way, you will be able to monitor possible hardware bottlenecks, or to check if your node is underutilized - you could use these valuable insights to make decisions regarding scaling up/down the allocated hardware resources.

Below, you can find a script that installs Node Exporter as a systemd service.

#!/bin/bash

# set the node_exporter version
VERSION=1.3.1

# download and untar the binary
wget https://github.com/prometheus/node_exporter/releases/download/v${VERSION}/node_exporter-${VERSION}.linux-amd64.tar.gz
tar xvf node_exporter-*.tar.gz
sudo cp ./node_exporter-${VERSION}.linux-amd64/node_exporter /usr/local/bin/

# create system user
sudo useradd --no-create-home --shell /usr/sbin/nologin node_exporter

# change ownership of node exporter binary
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

# remove temporary files
rm -rf ./node_exporter*

# create systemd service file (root privileges are required to write to /etc/systemd/system)
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<EOF
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOF

# enable the node exporter service and start it
sudo systemctl daemon-reload
sudo systemctl enable node_exporter.service
sudo systemctl start node_exporter.service

As a reminder, Node Exporter uses port 9100 by default, so be sure to expose this port to the machine that hosts the Prometheus server. The same should be done for the metrics port(s) of the blockchain node (in this case, port 7071 - for monitoring Bor - and port 26660 - for monitoring Heimdall).
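
For example, if you use ufw as a firewall, the rules could look like this (a sketch; <PROMETHEUS_IP> is a placeholder for the address of your Prometheus server):

sudo ufw allow from <PROMETHEUS_IP> to any port 9100 proto tcp
sudo ufw allow from <PROMETHEUS_IP> to any port 7071 proto tcp
sudo ufw allow from <PROMETHEUS_IP> to any port 26660 proto tcp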

Having installed Node Exporter and exposed the node's metrics, these endpoints should be added as targets under the scrape_configs section in your Prometheus configuration file (i.e. /etc/prometheus/prometheus.yml), after which the new config must be applied by restarting or reloading Prometheus (please check the official documentation; a validation and reload example is shown after the placeholder list below). The scrape configuration should look similar to this:

scrape_configs:
  - job_name: 'bor-node'
    scrape_interval: 10s
    metrics_path: /debug/metrics/prometheus
    static_configs:
      - targets:
        - '<NODE0_IP>:7071'
        - '<NODE1_IP>:7071' # you can add any number of nodes as targets
  - job_name: 'heimdall-node'
    scrape_interval: 10s
    static_configs:
      - targets:
        - '<NODE0_IP>:26660'
        - '<NODE1_IP>:26660' # you can add any number of nodes as targets
  - job_name: 'polygon-node-exporter'
    scrape_interval: 10s
    metrics_path: /metrics
    static_configs:
      - targets:
        - '<NODE0_IP>:9100'
        - '<NODE1_IP>:9100' # you can add any number of nodes as targets

In the configuration file above, please replace:

  • <NODE0_IP> - node 0's IP

  • <NODE1_IP> - node 1's IP (you can add any number of nodes as targets)

  • ...

  • <NODEN_IP> - node N's IP (you can add any number of nodes as targets)
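
Before applying the changes, you can validate the file and then reload Prometheus (a sketch assuming Prometheus runs as a systemd service and promtool is installed alongside it):

promtool check config /etc/prometheus/prometheus.yml
sudo systemctl reload prometheus   # or restart, depending on your setup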

That being said, the most important metrics that should be checked are:

  • node_cpu_seconds_total - CPU metrics exposed by Node Exporter - for monitoring purposes, you could use the following expression:

    • 100 - (avg by (instance) (rate(node_cpu_seconds_total{job="polygon-node-exporter",mode="idle"}[5m])) * 100), which means the average percentage of CPU usage over the last 5 minutes;

  • node_memory_MemTotal_bytes/node_memory_MemAvailable_bytes - RAM metrics exposed by Node Exporter - for monitoring purposes, you could use the following expression:

    • (node_memory_MemTotal_bytes{job="polygon-node-exporter"} - node_memory_MemAvailable_bytes{job="polygon-node-exporter"}) / 1073741824, which means the amount of RAM (in GB) used, excluding cache/buffers;

  • node_network_receive_bytes_total - network traffic metrics exposed by Node Exporter - for monitoring purposes, you could use the following expression:

    • rate(node_network_receive_bytes_total{job="polygon-node-exporter"}[1m]), which means the average network traffic received, per second, over the last minute (in bytes);

  • node_filesystem_avail_bytes - FS metrics exposed by Node Exporter - for monitoring purposes, you could use the following expression:

    • node_filesystem_avail_bytes{job="polygon-node-exporter",device="<DEVICE>"} / 1073741824, which means the filesystem space available to non-root users (in GB) for a certain device <DEVICE> (i.e. /dev/sda or wherever the blockchain data is stored) - this can be used to get an alert whenever the available space left is below a certain threshold (please be careful how you choose this threshold: if you have storage that can easily be increased - for example, EBS storage from AWS, you can set a lower threshold, but if you run your node on a bare metal machine which is not easily upgradable, you should set a higher threshold just to be sure you are able to find a solution before it fills up);

  • up - Prometheus automatically generated metrics - for monitoring purposes, you could use the following expressions:

    • up{job="bor-node"}, which has 2 possible values: 1, if the node is up, or 0, if the node is down - this can be used to get an alert whenever the node goes down (i.e. it can be triggered at each restart of the node);

    • up{job="heimdall-node"}, which has 2 possible values: 1, if the node is up, or 0, if the node is down - this can be used to get an alert whenever the node goes down (i.e. it can be triggered at each restart of the node);

  • chain_head_block & tendermint_consensus_latest_block_height - metrics exposed by the bor and the heimdall - for monitoring purposes, you could use the following expressions:

    • increase(tendermint_consensus_latest_block_height{job="heimdall-node"}[1m]), which means how many blocks heimdall has processed in the last 1 minute - this can be used to get an alert whenever the node has fallen behind by comparing with a certain threshold (you should start worrying if the difference is greater than 2-3 for the last 5 minutes);

    • increase(chain_head_block{job="bor-node"}[1m]), which means how many blocks bor has processed in the last 1 minute - this can be used to get an alert whenever the node has fallen behind by comparing with a certain threshold (you should start worrying if the difference is greater than 2-3 for the last 5 minutes);

  • tendermint_p2p_peers & p2p_peers - metrics exposed by the bor and the heimdall components - for monitoring purposes, you could use the following expressions:

    • tendermint_p2p_peers{job="heimdall-node"}, which means the number of peers connected to the node on the heimdall side - this can be used to get an alert whenever there are fewer peers than a certain threshold for a certain period of time (i.e. less than 3 peers for 5 minutes);

    • p2p_peers{job="bor-node"}, which means the number of peers connected to the node on the bor side - this can be used to get an alert whenever there are fewer peers than a certain threshold for a certain period of time (i.e. less than 3 peers for 5 minutes).

You can use the above metrics to create both Grafana dashboards and Alertmanager alerts.
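
As an illustration, a minimal Prometheus alerting rule built on the up metric could look like the sketch below (the rule group name, duration, and severity label are assumptions; adapt them to your own rule files):

groups:
  - name: polygon-node-alerts
    rules:
      - alert: BorNodeDown
        expr: up{job="bor-node"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Bor node is down"
          description: "The bor node has been unreachable for more than 5 minutes."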

Please also make sure to check the official documentation and the GitHub repository posted above in order to keep your node up to date.
