Saturday, February 15, 2025

Replace failed ZFS mirror drive in OPNSense

Welcome Back!

It's been a while since I last shared anything. I recently changed jobs and have been busy with that endeavor, but I hope to share more insights from this journey soon.

Encountering SMART Errors after OPNSense Upgrade

After upgrading one of my OPNSense instances, I noticed errors from one of my drives, ada1, during the restart. Investigating further, I found SMART errors. Although they were not enough to trigger a SMART failure, they were still concerning. Even manual short tests returned clean results. Here's what I found when running smartctl -a:

 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
...  
180 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       4185
...
195 Hardware_ECC_Recovered  0x0032   100   099   000    Old_age   Always       -       1676229109
...
SMART Error Log Version: 1
ATA Error Count: 4216 (device log contains only the most recent five errors)
...	
Error 4216 occurred at disk power-on lifetime: 32126 hours (1338 days + 14 hours)
  When the command that caused the error occurred, the device was in an unknown state.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  00 00 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:00:09.590  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:09.260  IDENTIFY DEVICE
  f5 00 00 00 00 00 00 00      00:00:09.250  SECURITY FREEZE LOCK
  ec 00 00 00 00 00 00 00      00:00:09.250  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:09.250  IDENTIFY DEVICE
...
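To keep an eye on just the counters that worried me, the attribute table can be filtered down to a few IDs. This is only a minimal sketch: the function name smart_raw is my own, it assumes the standard ATA attribute table where RAW_VALUE is the tenth column, and it needs to run under sh rather than csh.

```shell
#!/bin/sh
# Print "ID NAME RAW_VALUE" for the requested attribute IDs.
# Reads `smartctl -A` style output on stdin.
# Assumption: RAW_VALUE is the tenth whitespace-separated column.
smart_raw() {
    ids="$1"   # space-separated attribute IDs, e.g. "180 195"
    awk -v ids="$ids" '
        BEGIN { n = split(ids, want, " "); for (i = 1; i <= n; i++) keep[want[i]] = 1 }
        $1 ~ /^[0-9]+$/ && ($1 in keep) { print $1, $2, $10 }
    '
}

# Usage on the firewall (device name is an example):
#   smartctl -A /dev/ada1 | smart_raw "180 195"
```

Running it before and after a few days of uptime shows whether the raw values are still climbing.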

This drive is on its way out, so I need to replace it. This particular system is a retasked $30 Barracuda Load Balancer 340 that I upgraded with a new processor and memory. It works great for the use case. Unfortunately, it uses commodity hardware (an MSI customized mainboard), and its manual did not state it supports hotplug, so I had to bring it down to swap it out. Log into the console via your preferred method—I'm using SSH. The first task is to remove the failing drive from the ZFS pool after identifying it.

root@OPNsense:~ # zpool status

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:00:15 with 0 errors on Wed Feb  5 01:31:15 2025
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p4  ONLINE       0     0     0
            ada1p4  ONLINE       0     0     0
errors: No known data errors 
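On a pool with more members than mine, it can help to filter that output down to the partitions that live on the suspect disk. A small sketch, assuming device names follow the diskNpM pattern shown above; members_on_disk is a name I made up, and it should be run under sh.

```shell
#!/bin/sh
# List pool members (name and state) that are partitions of the given disk.
# Reads `zpool status` output on stdin.
members_on_disk() {
    disk="$1"   # e.g. ada1
    awk -v d="$disk" '$1 ~ ("^" d "p[0-9]+$") { print $1, $2 }'
}

# Usage (pool name from the output above):
#   zpool status zroot | members_on_disk ada1
```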

Since ada1 is the failing drive, its ZFS partition is ada1p4, so that is the device we detach from the pool.

 root@OPNsense:~ # zpool detach zroot ada1p4

Replacing the Drive

As this system does not support hotplug, I then shut it down and swapped the defective drive with a known good one of the same size or larger. The new drive should be clean, without any partitions on it. In this case that is less critical, because the replacement is the second drive and OPNSense boots back up from ada0. Your results may vary. Once back up, log back into the console via your preferred method.

Verifying the Partitions

Firstly, we need to verify the partitions because copying the wrong ones could lead to trouble! If your new drive has a partition table this will show it.

root@OPNsense:~ # gpart show

=>       40  500118112  ada0  GPT  (238G)
         40     532480     1  efi  (260M)
     532520       1024     2  freebsd-boot  (512K)
     533544        984        - free -  (492K)
     534528   16777216     3  freebsd-swap  (8.0G)
   17311744  482805760     4  freebsd-zfs  (230G)
  500117504        648        - free -  (324K)
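For a quicker read of which disks carry a partition table and what is on them, the gpart output can be boiled down to one line per partition. This is a sketch only: gpart_layout is a hypothetical helper name, it assumes the column layout shown above, and it is meant for sh.

```shell
#!/bin/sh
# Summarise `gpart show` output (stdin) as "disk index type", one line per partition.
# Lines starting with "=>" name the disk; "- free -" rows are skipped.
gpart_layout() {
    awk '
        $1 == "=>" { disk = $4; next }
        NF >= 4 && $3 ~ /^[0-9]+$/ { print disk, $3, $4 }
    '
}

# Usage:
#   gpart show | gpart_layout
```

If the new drive shows up in that list, it still has a partition table on it.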

We have four partitions and the partition table to clone to the new disk. We use dd on partitions 1 and 2, while partition 3 (swap) and partition 4 (ZFS) are handled by their own tools, swapon and zpool. Next, we need to turn off swap. Since both swap partitions are listed in /etc/fstab, we receive an error for the swap located on the now-missing disk.

root@OPNsense:~ # swapoff -a

swapoff: removing /dev/ada0p3 as swap device
swapoff: /dev/ada1p3: No such file or directory

Cloning Partitions

Now comes the potentially dangerous part, so be VERY careful here. The source drive is ada0, and the new drive is ada1. We will clone the partition table from ada0 to ada1.

root@OPNsense:~ # gpart backup ada0 | gpart restore -F ada1

Next we clone partition 1:

root@OPNsense:~ # dd if=/dev/ada0p1 of=/dev/ada1p1
532480+0 records in
532480+0 records out
272629760 bytes transferred in 22.115694 secs (12327434 bytes/sec)

Then partition 2:

root@OPNsense:~ # dd if=/dev/ada0p2 of=/dev/ada1p2
1024+0 records in
1024+0 records out
524288 bytes transferred in 0.054477 secs (9623964 bytes/sec)
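I did not do this at the time, but a byte-for-byte comparison is a cheap way to confirm the two dd copies. A minimal sketch using cmp; same_contents is my own helper name, and it assumes nothing writes to either partition between the copy and the check.

```shell
#!/bin/sh
# Report whether two files or devices are byte-identical.
# cmp -s is silent and exits 0 only when the contents match.
same_contents() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "DIFFER"
    fi
}

# Usage for the two cloned partitions:
#   same_contents /dev/ada0p1 /dev/ada1p1
#   same_contents /dev/ada0p2 /dev/ada1p2
```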

For the ZFS mirror, we use the zpool tool to attach it to the zroot pool as shown by zpool status above.

 root@OPNsense:~ # zpool attach zroot ada0p4 ada1p4

You can verify it’s back to an expected state via zpool status:

 root@OPNsense:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 2.29G in 00:00:10 with 0 errors on Sat Feb 15 10:14:39 2025
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p4  ONLINE       0     0     0
            ada1p4  ONLINE       0     0     0

errors: No known data errors
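My pool is tiny, so the resilver finished in seconds. On a larger pool you may want to check whether it is still running before moving on. A sketch under the assumption that an active resilver shows "resilver in progress" on the scan line; resilver_state is a made-up name, and it should be run under sh.

```shell
#!/bin/sh
# Read `zpool status` output on stdin and print "resilvering" while a
# resilver is running, otherwise the pool state (e.g. ONLINE, DEGRADED).
resilver_state() {
    awk '
        $1 == "scan:" && /resilver in progress/ { busy = 1 }
        $1 == "state:" { state = $2 }
        END { if (busy) print "resilvering"; else print state }
    '
}

# Usage:
#   zpool status zroot | resilver_state
```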

Finally, turn swap back on, which will take care of the third partition.

 root@OPNsense:~ # swapon -a
swapon: adding /dev/ada0p3 as swap device
swapon: adding /dev/ada1p3 as swap device

At this point, it would be a good idea to go to the GUI, navigate to System: Settings: Cron, and verify that the SMART tasks, along with any ZFS tasks, are configured the way you want. I only have a monthly scrub because I enabled autotrim per my config articles.


 


Friday, June 2, 2023

Prevent Inkjet Printer From Drying Out

Super quick post for this. My parents have a nice multi-function inkjet that they barely use. A B&W laser printer would be a better choice, but it's what they got, as they thought they would print pictures. I noticed they had bought a cleaning kit to try to get it back into use after it dried out. After helping with that, I wanted to come up with something else so they didn't have to buy another cleaning kit or other parts.

I came up with a simple solution: a scheduled task that runs weekly and prints a small color page to keep the inkjet in good condition. Sure, it wastes a little ink, but compared to the cost of maintenance parts it's way cheaper, and the printer is a refillable model.

To do this, you simply have a scheduled task run WordPad with a switch that prints a file. I created an RTF file with the following:


It has a little bit of color to keep the ink fresh but not enough to cost a lot. The cleaning kits would have you print out full pages of color blocks. The scheduled task is super simple. Just start Task Scheduler via Start and create a Basic Task by right-clicking.


Give it some basic info and click Next


Click Weekly then Click Next


Set the schedule to run. In this case I chose Sunday at 3AM, as it's in their basement and I never run anything on the hour. Click Next


Select Start a program and then click Next

For the Program/script enter 

"C:\Program Files (x86)\windows nt\accessories\wordpad.exe"

and for the Add arguments enter

/p "C:\Path\to\PrintTestPage.RTF"

Then click Next

On the final dialog screen choose the checkbox for 'Open the Properties dialog for this task when I click Finish' and Click Finish.


After the Properties dialog opens, adjust the user as needed. In this case it runs on a Windows server, so I used a local account. For Windows 10/11 you can use your account or another account. I would shy away from using SYSTEM or the built-in Administrator account, as you do not need any elevated permissions to run it. The account that runs the task will need to be able to access the RTF file.



-Kevin

Thursday, May 25, 2023

OPNSense Configuration (Part 2 - Deploy-Config)

This is part two of a two-part series. 

As I mentioned in Part One, this configuration is written as two parts for a specific use case. The first is a 'base-config' that has all common settings, and part two covers settings that would be different between my friend's clients. Between the two parts, you can put together a fully functional OPNSense Layer 7 firewall with ZenArmor for personal or small business use. Just like with Part One, you can adjust as needed, such as importing the config.

Base-Config Deployment Process

Put the downloaded Part-One config on a separate FAT32 USB stick as /conf/config.xml for import during install. Do not put on the install media.

Follow Part One to install OPNSense until the Initial Wizard via HTTPS step.

NOTE: During boot from the install media, press any key to run the configuration importer. Alternatively, the config can be imported via the GUI later. Importing the configuration at this stage means all plugins are installed when you run selection 12 to update to the latest version, which saves some time.


You would simply type the device name and it will import the configuration from part one. In this example, you simply enter 'da0'. Continue the boot and let it autoconfigure the networks. 

NOTE: If your NICs are not Intel emX based (igcX, for example), you can modify the interface names in the config file before import, as this will save time later. Boot the installer USB and it will list the device NICs.

TIP: Search for ‘>em0<’, for example, because older VLANs could be named ‘em0_vlan400’; including the angle brackets excludes the VLANs from the match so they can be replaced later.

  • em0=LAN
  • em1=WAN
  • em2=WLAN
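The search-and-replace from the tip can be sketched with sed. This is an illustration only: rename_nic is my own helper name, igc0 is an example target, and it assumes the interface name sits alone between tags as in ‘>em0<’. Work on a copy of the config file.

```shell
#!/bin/sh
# Replace an interface name only where it appears as a whole tag value,
# i.e. ">em0<", so names such as "em0_vlan400" are left untouched.
# Reads the config on stdin and writes the result to stdout.
rename_nic() {
    old="$1"
    new="$2"
    sed "s/>${old}</>${new}</g"
}

# Usage (file names are examples):
#   rename_nic em0 igc0 < config.xml > config-igc.xml
```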

From another PC on the LAN go to HTTPS://192.168.1.1 and log in with root/opnsense

If the config was not imported at install, navigate to System: Configuration: Backups and restore backup.
Restart and then log in again

Navigate to SYSTEM: SETTINGS: GENERAL and set host info
Hostname= Hostname of FW
Domain= Domain of network
        Click Save

Navigate to INTERFACES: WAN
Adjust IPv4 Configuration Type for ISP if not DHCP
If Xfinity modem go to DHCP client configuration
Reject Leases From = 192.168.100.1 (customer-provided modem) 
                        Reject Leases From = 10.0.0.1 (Xfinity provided modem)
Optionally enable IPv6 configuration if ISP supports it and desired
If Xfinity
                        IPv6 Configuration Type=DHCPv6
                        DHCPv6 Client Configuration
                        Prefix delegation size=60
                        Send IPv6 prefix hint = checked
                        Use IPv4 connectivity=checked 

Navigate to INTERFACES: LAN
Adjust Static IPv4 configuration as needed
If IPv6 was enabled on WAN interface and it is desired on LAN
                        set IPv6 Configuration Type = Track Interface
Track IPv6 Interface
IPv6 Interface=WAN
IPv6 Prefix ID=0
Click Save

Optionally delete WLAN if not used
Navigate to INTERFACES: ASSIGNMENTS
        Click Delete icon
        Click Save

Navigate to INTERFACES: WLAN
Adjust Static IPv4 configuration as needed 
If IPv6 was enabled on WAN interface and it is desired on WLAN
                        set IPv6 Configuration Type = Track Interface
Track IPv6 Interface
IPv6 Interface=WAN
IPv6 Prefix ID=1
Click Save

Configure DNS as needed
Optionally remove DNS over TLS for Cloudflare
Navigate to SERVICES: UNBOUND DNS: DNS over TLS
Delete the two CloudFlare entries
                Use System Nameservers = checked
                Navigate to SYSTEM: SETTINGS: GENERAL: NETWORKING
        Add DNS servers or enable DNS from DHCP/PPP on WAN

If Windows Domain is present
                Navigate to SERVICES: UNBOUND DNS: OVERRIDES: DOMAIN OVERRIDES
Click Plus icon to create
Domain = Windows AD Domain FQDN
IP address= Windows AD DC server IP Address
Description=Friendly name of AD Domain
Click Save
                Click Plus icon to create
Domain = Windows AD Domain reverse
example=1.168.192.in-addr.arpa
IP address= Windows AD server IP Address
Description=Friendly name of AD Domain
Click Save
Optionally on Windows AD DC Servers change their current upstream to this OPNSense LAN IP
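If you are unsure how the reverse domain is spelled for your network, it can be derived from any address in it. A small sketch that assumes a /24 network; reverse_zone_24 is a hypothetical helper name, and it should be run under sh.

```shell
#!/bin/sh
# Build the reverse lookup zone for the /24 that contains the given IPv4
# address: the first three octets in reverse order plus ".in-addr.arpa".
reverse_zone_24() {
    echo "$1" | awk -F. '{ print $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Usage:
#   reverse_zone_24 192.168.1.10
```

For 192.168.1.10 this prints 1.168.192.in-addr.arpa, which is the value to enter as the reverse Domain.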

Configure GEOIP Blocking
        Navigate to FIREWALL: ALIASES: GEOIP SETTINGS
URL= https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-Country-CSV&license_key=My_License_key&suffix=zip
  More detail located here 
Click Apply
        Navigate to FIREWALL: ALIASES: ALIASES:
Edit GeoIPBlock and adjust countries as needed

If AMD CPU based, change Thermal Sensors
        Navigate to SYSTEM: SETTINGS: MISCELLANEOUS
                Thermal Sensors 
                Hardware = AMD K8,K10 and K11 CPU on-die thermal sensor (amdtemp)

Navigate to SERVICES: VNSTAT: GENERAL
Enable vnStat daemon=checked
Interface=WAN
Click Save

Install and Configure ZenArmor
More details are located here
        Navigate to ZENARMOR: DASHBOARD
Agree to EULA
Click Proceed to start the setup Wizard
Click Next
Choose install 
Click radio button for Install a local Elasticsearch Database
If low spec then Mongodb will be only option
Click Install Database & Proceed
Click Done
Under Deployment mode select radio button for Routed Mode with native netmap driver
If using unusual NICs you may need to choose emulated netmap
Under Interface Selection choose LAN and WLAN if applicable
Click each interface then the >> button
Do not choose any VLAN interfaces, only the physical interface
Click Next
        On Cloud Reputation & Web Categorization tab
Local Domains Names to Exclude From Cloud Queries = Local domain and/or Windows AD Domain if present
        Click Next
        On Updates and Support click Next
        On Deployment Size select the correct environment size
Hardware requirements are located here
        Click Next
        Click Finish

Optionally install Zenarmor license
        Navigate to ZENARMOR: DASHBOARD
        Click Upgrade to a Subscription at top
        Choose options

Change Root Password
Navigate to SYSTEM: ACCESS: USERS
Edit Root user and set password.
Click Save

Optionally Add additional users
Click Plus
Configure as required
Click Save

Enable System Notifications via Email
Navigate to SERVICES: MONIT: SETTINGS
Enable Monit =Checked
Mail Server Address = Mail server IP
Mail Server Port = Mail Server required
Mail Server Username = Mail Server required
Mail Server Password = Mail Server required
Mail Server SSL Connection = Mail Server required
Navigate to Services: Monit: Alert Settings
Click Plus and configure as required
Enabled alert = checked
Recipient= email address for alerts
Not on = checked
Events=Nothing Selected
Mail format = Leave blank
Reminder=Leave blank
Description = Description as needed
                Click Save
Click Apply

Optionally adjust access protocols and ports
    Navigate to SYSTEM: SETTINGS: ADMINISTRATION
Change HTTPS TCP port as required
                Change SSH port as required

Optionally install Postfix to handle all emails for site
        Navigate to SYSTEM: FIRMWARE: PLUGINS
                Install os-postfix
                Navigate to SERVICES: POSTFIX and configure for the Email provider

Optionally install ACME Client for Let's Encrypt Certificate
        Navigate to SYSTEM: FIRMWARE:PLUGINS
                Install os-acme-client
                Navigate to SERVICES: ACME Client and configure Certificates

Optionally install DDNS Client
Navigate to SYSTEM: FIRMWARE:PLUGINS
Install os-dyndns for legacy (more support but older)
Install os-ddclient for modern
        Navigate to SERVICES: DYNAMIC DNS and configure for DDNS Provider

Optionally configure UPS
Navigate to SERVICES: NUT: CONFIGURATION: GENERAL SETTINGS
Enable Nut= checked
Click Down arrow on UPS type and choose the relevant Type
For most brands use USBHID-Driver
Enable= Check
Navigate to SERVICES: NUT: DIAGNOSTICS and it should show stats from UPS.

Enable ZFS pool trim
        SSH to the OPNSense and at the command prompt type zpool set autotrim=on zroot

Enable SMART tests on storage
        Navigate to LOBBY: DASHBOARD
        Under SMART status note the drive(s) with OK under Status
        Examples would be da0, nvme0
        Navigate to SYSTEM: SETTINGS: CRON
        Click Plus to create new Cron entry
        Minutes = 5
        Hours = 2
        Day of the month = 2
        Command = Run SMART test (short)
        Parameters = /dev/drivename (/dev/da0 for example)
        Description = drivename Smart Test
        Click Save
        Click Plus and duplicate for remaining drives.
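For reference, the schedule above reads as 02:05 on the 2nd of every month. The sketch below prints roughly equivalent plain crontab lines, one per drive. It is not what the GUI writes internally; the month and weekday fields are assumed to be left at *, and smart_cron_lines is a name I made up.

```shell
#!/bin/sh
# Print one crontab-style line per drive for a short SMART self-test at
# minute 5, hour 2, day-of-month 2 (month and weekday assumed to be "*").
smart_cron_lines() {
    for drive in "$@"; do
        printf '5 2 2 * * smartctl -t short /dev/%s\n' "$drive"
    done
}

# Usage (drive names are examples):
#   smart_cron_lines da0 nvme0
```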

Any Firewall Rule customizations
Navigate to FIREWALL: RULES: INTERFACES
Examples:
            DNS redirection
            Remove/Disable the floating 'This Firewall' rule if required
WLAN to LAN for printer or Active Directory