The Geek Diary

Troubleshooting Solaris IPMP

by admin

How to configure Solaris 10 Probe based IPMP
How to configure Solaris 10 Link Based IPMP

Solaris IP multipathing (IPMP) provides high availability and load spreading for the networking stack by removing the network interface as a single point of failure. Problems can appear both while configuring IPMP and after it is in service. Below are some tips for troubleshooting a Solaris IPMP configuration.

Testing IPMP failover

The if_mpadm command makes it easy to test the failure and repair of an interface. The “-d” option detaches the interface, which moves its addresses to another interface in the group, and “-r” reattaches it.

# if_mpadm -d ce0
# if_mpadm -r ce0
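
The detach and reattach steps can be wrapped in a small function so that the interface state is displayed in between. This is only a sketch: the name ipmp_failover_test is hypothetical, and it assumes the interface passed in belongs to an IPMP group with at least one other working interface.

```shell
# Sketch: detach an interface, show the interface state, then reattach it.
# ipmp_failover_test is a hypothetical helper name, not a Solaris command.
ipmp_failover_test() {
    nic="$1"
    if [ -z "$nic" ]; then
        echo "usage: ipmp_failover_test <interface>" >&2
        return 2
    fi
    # Detach: the data addresses should move to another interface in the group.
    if_mpadm -d "$nic" || { echo "detach of $nic failed" >&2; return 1; }
    # Show where the addresses are hosted while the interface is detached.
    ifconfig -a
    # Reattach: the addresses should fail back to the original interface.
    if_mpadm -r "$nic"
}

# Example (run as root):
#   ipmp_failover_test ce0
```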

Check if in.mpathd daemon is running

The in.mpathd daemon is responsible for detecting interface failures and repairs in IPMP groups. Check whether the process is running on the system :

# ps -ef | grep mpath
    root  2222     1   0 20:41:10 ?           0:06 /usr/lib/inet/in.mpathd

If it is not running, start it with the command below :

# /usr/lib/inet/in.mpathd

To make the in.mpathd daemon re-read the /etc/default/mpathd configuration file after you change it, use :

# pkill -HUP in.mpathd
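
The running check can also be scripted. The sketch below uses pgrep with an exact name match, which avoids matching the grep process itself as "ps -ef | grep" can. The function name mpathd_status is hypothetical.

```shell
# Sketch: report whether in.mpathd is running.
# mpathd_status is a hypothetical helper name.
mpathd_status() {
    # -x requires the process name to match exactly.
    if pgrep -x in.mpathd >/dev/null; then
        echo "in.mpathd is running"
        return 0
    fi
    echo "in.mpathd is not running"
    return 1
}

# Example:
#   mpathd_status || /usr/lib/inet/in.mpathd
```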

Check the messages file

The first thing to do is check the /var/adm/messages file for mpathd related entries. You may find both errors and informational messages related to IPMP, such as those listed below. These entries often point directly to the problem in the IPMP configuration.

1. interfaces configured for IPMP showing as "FAILED" in "ifconfig -a" output
2. "Successfully failed over from NIC xxxx to NIC xxxx", "NIC repair detected on ", "Successfully failed back to NIC ", "The link has come up on ", "The link has gone down on "
3. "No test address configured on interface disabling probe-based failure detection on it"
4. "Test address address is not unique; disabling probe based failure detection on "
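
To pull only the in.mpathd entries out of the log, a small filter such as the one below may help. It is a sketch: the function name ipmp_messages is hypothetical, and it assumes the daemon name appears in each syslog line.

```shell
# Sketch: print the in.mpathd entries from a syslog-style file.
# ipmp_messages is a hypothetical helper; the default path is the file named above.
ipmp_messages() {
    logfile="${1:-/var/adm/messages}"
    # Match the daemon name as it appears in the log lines.
    grep 'in\.mpathd' "$logfile"
}

# Example: show the 20 most recent IPMP entries
#   ipmp_messages | tail -20
```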

Check the Flags in ifconfig command output

The ifconfig -a command output displays the various flags related to IPMP and interface configuration.

1. interfaces configured for IPMP missing the "UP" and/or "RUNNING" flag in the ifconfig -a output
2. interfaces configured for IPMP showing as "FAILED" in "ifconfig -a" output

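The various flags related to IPMP and their meanings are :

deprecated -> the address is used as an IPMP test address and is not chosen as a source address for application data traffic.
-failover -> the address does not fail over when the interface fails (shown as NOFAILOVER in the ifconfig output).
standby -> the interface is used as a standby and carries data traffic only after another interface in the group fails.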
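
The flag checks described above can be automated by parsing the ifconfig -a output. The following is a sketch, not a definitive tool: the function name ipmp_flag_report is hypothetical, it assumes the usual "name: flags=...<FLAG,FLAG,...>" line layout, and it reports on every non-loopback interface, not only those in an IPMP group. On Solaris 10 the awk call may need to be nawk or /usr/xpg4/bin/awk.

```shell
# Sketch: read `ifconfig -a` output on stdin and print one line per problem:
# an interface marked FAILED, or one missing the UP or RUNNING flag.
# ipmp_flag_report is a hypothetical helper name.
ipmp_flag_report() {
    awk '
    / flags=[0-9a-f]+</ {
        name = $1
        sub(/:$/, "", name)            # "ce0:" becomes "ce0"
        if (name ~ /^lo[0-9]/) next    # skip loopback interfaces
        flags = $2
        sub(/^[^<]*</, "", flags)      # drop everything up to "<"
        sub(/>.*$/, "", flags)         # drop the closing ">"
        flags = "," flags ","          # pad so every flag is comma-delimited
        if (index(flags, ",FAILED,"))   print name ": FAILED"
        if (!index(flags, ",UP,"))      print name ": missing UP"
        if (!index(flags, ",RUNNING,")) print name ": missing RUNNING"
    }'
}

# Example:
#   ifconfig -a | ipmp_flag_report
```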

If the interface is not showing the RUNNING flag, check the output of any of the commands below to ensure that you have a working link between the server and the switch port.

# ndd -get /dev/[interface] adv_autoneg_cap      -- select the interface instance first, then read the auto-negotiation property value
# kstat -p |grep e1000g:0 |grep auto
# dladm show-dev

Ensure that the switch port is set to auto-negotiate. Disconnect and reconnect the Ethernet cable on the server side to renegotiate the link speed with the switch port.

If the interface is not showing the UP flag, use :

# ifconfig [interface in group] up

Determine if the default router is properly answering ICMP probes

Probe based IPMP sends ICMP probes to any on-link routers and listens for the responses. The in.mpathd daemon uses the test addresses to exchange these probes, also called probe traffic, with targets on the IP link. Probe traffic helps to determine the status of the interface and its NIC, including whether an interface has failed, by verifying that both the send and receive paths of the interface work. We can watch the snoop command output to confirm that the on-link router is responding to pings.

In the first window :

geeklab # snoop -d hme0 icmp
Using device /dev/hme (promiscuous mode)

In the second window :

geeklab # ping 192.168.1.1
192.168.1.1 is alive

Here 192.168.1.1 is the default router. You can check the default router in the netstat -nrv output.

Now in the first window you should be able to see the traffic :

geeklab -> 192.168.1.1  ICMP Echo request (ID: 1023 Sequence number: 0) 
192.168.1.1 -> geeklab        ICMP Echo reply (ID: 1023 Sequence number: 0)

Here the first line is the outgoing ICMP request (the “ping”) and the second line is the ICMP reply.
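
The default router address used in the ping test can be extracted from the routing table rather than read by eye. This is a minimal sketch that assumes the netstat -rn layout in which a default route line starts with the word "default" followed by the gateway address. The function name default_routers is hypothetical.

```shell
# Sketch: print the gateway of every default route found in
# `netstat -rn` output supplied on stdin.
# default_routers is a hypothetical helper name.
default_routers() {
    awk '$1 == "default" { print $2 }'
}

# Example: ping each default router in turn
#   for r in `netstat -rn | default_routers`; do ping "$r"; done
```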

If you are using probe based IPMP (the test addresses are marked with -failover), send SIGUSR1 to in.mpathd with pkill to make it log a debug snapshot, then check /var/adm/messages for “probes lost” messages :

# pkill -USR1 mpathd
# tail -20 /var/adm/messages

Are systems on the subnet able to respond to all-hosts multicast?

Use netstat and check for the interfaces’ membership in 224.0.0.1 :

geeklab # netstat -gn|grep 224.0.0.1
lo0 224.0.0.1 1
hme0 224.0.0.1 1

If the netstat -gn output shows interfaces that are not members of the ALL-SYSTEMS multicast group (224.0.0.1) and so cannot respond to it, add a host route using the route -p command.
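
The membership check can be scripted per interface. The sketch below assumes the interface name is in the first column and the group address in the second, as in the output shown above. The function name in_all_hosts_group is hypothetical, and on Solaris 10 the awk call may need to be nawk because of the -v option.

```shell
# Sketch: succeed (exit status 0) if the named interface is listed as a
# member of 224.0.0.1 in `netstat -gn` output supplied on stdin.
# in_all_hosts_group is a hypothetical helper name.
in_all_hosts_group() {
    awk -v nic="$1" '
        $1 == nic && $2 == "224.0.0.1" { found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Example:
#   netstat -gn | in_all_hosts_group hme0 || echo "hme0 has not joined 224.0.0.1"
```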

Is VCS “Multi-NIC” in use with IPMP?

VCS uses a resource type called Multi-NIC to manage IPMP through the Solaris mpathd daemon. Find out whether VCS is in use by checking for VCS processes and for LLT and GAB entries in the /var/adm/messages file.

# ps -ef|grep -i multi 
# grep -i LLT /var/adm/messages 
# grep -i GAB /var/adm/messages

If you are using VCS, check the main.cf file for the configuration details and use the hastatus command to verify that the Multi-NIC resource is configured properly and is running normally.

Contact support with data

The last option, if everything else fails, is to contact Oracle support. Provide the data below to Oracle support for troubleshooting.

1. snoop

# snoop -d (first interface in the group) -o /tmp/ -s 60 -q
# snoop -d (second interface in the group) -o /tmp/ -s 60 -q

2. Explorer
Sun Explorer output :

# explorer     -- the command may vary with hardware

3. dladm

# dladm show-dev > show-dev.out
# dladm show-link > show-link.out
# dladm show-aggr -L > show-aggr.out
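
The three dladm reports can be gathered into a single directory before sending them to support. This is only a sketch: the function name collect_dladm and the default directory /var/tmp/ipmp-data are assumptions, not part of any Solaris tool.

```shell
# Sketch: save the three dladm reports into one directory.
# collect_dladm is a hypothetical helper; the default directory is an assumption.
collect_dladm() {
    outdir="${1:-/var/tmp/ipmp-data}"
    mkdir -p "$outdir" || return 1
    # Capture stderr too, so that an error message is kept in the report file.
    dladm show-dev     > "$outdir/show-dev.out"  2>&1
    dladm show-link    > "$outdir/show-link.out" 2>&1
    dladm show-aggr -L > "$outdir/show-aggr.out" 2>&1
    echo "dladm output saved in $outdir"
}

# Example (run as root):
#   collect_dladm /var/tmp/ipmp-data
```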

Filed Under: Solaris

© 2023 · The Geek Diary
