SPARC: The Boot PROM

Each SPARC based system has a PROM (programmable read-only memory) chip with a program called the monitor. The monitor controls the operation of the system before the Solaris kernel is available. When a system is turned on, the monitor runs a quick self-test procedure to check the hardware and memory on the system. If no errors are found, the system begins the automatic boot process.

Note – Some older systems might require PROM upgrades before they will work with the Solaris system software. Contact your local service provider for more information.

SPARC: The Boot Process

The following table describes the boot process on SPARC based systems.

Table 15–1 SPARC: Description of the Boot Process

Boot PROM
  1. The PROM displays system identification information and then runs self-test diagnostics to verify the system's hardware and memory.
  2. The PROM then loads the primary boot program, bootblk, whose purpose is to load the secondary boot program (located in the ufs file system) from the default boot device.

Boot Programs
  3. The bootblk program finds the secondary boot program, ufsboot, loads it into memory, and executes it.
  4. After the ufsboot program is loaded, the ufsboot program loads the kernel.

Kernel Initialization
  5. The kernel initializes itself and begins loading modules by using ufsboot to read the files. When the kernel has loaded enough modules to mount the root (/) file system, the kernel unmaps the ufsboot program and continues, using its own resources.
  6. The kernel creates a user process and starts the /sbin/init process, which starts other processes by reading the /etc/inittab file.

init
  7. The /sbin/init process starts the run control (rc) scripts, which execute a series of other scripts. These scripts (/sbin/rc*) check and mount file systems, start various processes, and perform system maintenance tasks.
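The automatic boot that the table describes can also be started by hand from the PROM monitor's ok prompt. A quick sketch, assuming the standard OBP device aliases (disk, cdrom, net); the aliases defined on a given machine may differ:

ok boot             (boot from the default boot-device)
ok boot disk -s     (boot from the disk alias into single-user mode)
ok boot net         (boot over the network, for example from a JumpStart server)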
IA: The Boot Process

The following table describes the boot process on IA based systems.

Table 15–3 IA: Description of the Boot Process

BIOS
  1. When the system is turned on, the BIOS runs self-test diagnostics to verify the system's hardware and memory. The system begins to boot automatically if no errors are found. If errors are found, error messages are displayed that describe recovery options. The BIOS of additional hardware devices are run at this time.
  2. The BIOS boot program tries to read the first physical sector from the boot device. This first disk sector on the boot device contains the master boot record, mboot, which is loaded and executed. If no mboot file is found, an error message is displayed.

Boot Programs
  3. The master boot record, mboot, which contains the disk information needed to find the active partition and the location of the Solaris boot program, pboot, loads and executes pboot.
  4. The Solaris boot program, pboot, loads bootblk, the primary boot program, whose purpose is to load the secondary boot program that is located in the ufs file system.
  5. If there is more than one bootable partition, bootblk reads the fdisk table to locate the default boot partition, and builds and displays a menu of available partitions. You have a 30-second interval to select an alternate partition from which to boot. This step occurs only if there is more than one bootable partition present on the system.
  6. bootblk finds and executes the secondary boot program, boot.bin or ufsboot, in the root (/) file system. You have a 5-second interval to interrupt the autoboot to start the Solaris Device Configuration Assistant.
  7. The secondary boot program, boot.bin or ufsboot, starts a command interpreter that executes the /etc/bootrc script, which provides a menu of choices for booting the system. The default action is to load and execute the kernel. You have a 5-second interval to specify a boot option or to start the boot interpreter.

Kernel Initialization
  8. The kernel initializes itself and begins loading modules by using the secondary boot program (boot.bin or ufsboot) to read the files. When the kernel has loaded enough modules to mount the root (/) file system, the kernel unmaps the secondary boot program and continues, using its own resources.
  9. The kernel creates a user process and starts the /sbin/init process, which starts other processes by reading the /etc/inittab file.

init
  10. The /sbin/init process starts the run control (rc) scripts, which execute a series of other scripts. These scripts (/sbin/rc*) check and mount file systems, start various processes, and perform system maintenance tasks.

Extended Diagnostics: If diag-switch? and diag-level are set, additional diagnostics will appear on the system console.

auto-boot?: If the auto-boot? PROM parameter is set, the boot process will begin. Otherwise, the system will drop to the ok PROM monitor prompt, or (if sunmon-compat? and security-mode are set) the > security prompt. The boot process will use the boot-device and boot-file PROM parameters unless diag-switch? is set, in which case the boot process will use diag-device and diag-file.
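These parameters can be viewed and changed from the PROM monitor with printenv and setenv, or from a running system with the eeprom(1M) command. A quick sketch; the device alias is illustrative:

ok printenv auto-boot?
ok setenv auto-boot? true

# eeprom auto-boot?
auto-boot?=true
# eeprom boot-device=disk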
bootblk: The OBP (Open Boot PROM) program loads the bootblk primary boot program from the boot-device (or diag-device, if diag-switch? is set). If the bootblk is not present or needs to be regenerated, it can be installed by running the installboot command after booting from a CDROM or the network. A copy of the bootblk is available at /usr/platform/`arch -k`/lib/fs/ufs/bootblk.

ufsboot: The secondary boot program, /platform/`arch -k`/ufsboot, is run. This program loads the kernel core image files. If this file is corrupted or missing, a "bootblk: can't find the boot program" or similar error message is returned.

kernel: The kernel is loaded and run. For 32-bit Solaris systems, the relevant files are /platform/`arch -k`/kernel/unix and /kernel/genunix. For 64-bit Solaris systems, the files are /platform/`arch -k`/kernel/sparcV9/unix and /kernel/genunix. As part of the kernel loading process, the kernel banner is displayed on the screen. This includes the kernel version number (including patch level, if appropriate) and the copyright notice. The kernel initializes itself and begins loading modules, reading the files with the ufsboot program until it has loaded enough modules to mount the root filesystem itself. At that point, ufsboot is unmapped and the kernel uses its own drivers. If the system complains about not being able to write to the root filesystem, it is stuck in this part of the boot process. The boot -a command single-steps through this portion of the boot process, which can be a useful diagnostic procedure if the kernel is not loading properly.

/etc/system: The /etc/system file is read by the kernel, and the system parameters are set. The following types of customization are available in the /etc/system file:
moddir: Changes the search path for kernel modules.
forceload: Forces loading of a kernel module.
exclude: Excludes a particular kernel module.
rootfs: Specifies the file system type for root. (ufs is the default.)
rootdev: Specifies the physical device path for root.
set: Sets the value of a tuneable system parameter.

If the /etc/system file is edited, it is strongly recommended that a copy of the working file be made to a well-known location. In the event that the new /etc/system file renders the system unbootable, it might be possible to bring the system up with a boot -a command that specifies the old file. If this has not been done, the system may need to be booted from CD or network so that the file can be mounted and edited.
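As a minimal sketch, entries in /etc/system look like the following; every value shown here (module names, device path, tunable) is illustrative only and must be adapted to the system at hand:

* Comment lines in /etc/system begin with an asterisk.
* Extra module search path:
moddir: /platform/sun4u/kernel /kernel /usr/kernel
* Force a driver to load even if its hardware is not seen at boot:
forceload: drv/ssd
* Keep a module from loading:
exclude: lofs
* Root file system type and device (an SVM metadevice is shown):
rootfs:ufs
rootdev:/pseudo/md@0:0,0,blk
* Tunable parameter:
set maxusers = 128

If a backup copy such as /etc/system.bak (name hypothetical) was made beforehand, a boot -a can point the kernel at that copy, or at /dev/null to fall back to defaults, when the edited file proves unbootable.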
kernel initialized: The kernel creates PID 0 (sched). The sched process is sometimes called the "swapper." The kernel then starts PID 1 (init).

init: The init process reads /etc/inittab and /etc/default/init and follows the instructions in those files. Some of the entries in /etc/inittab are:
fs: sysinit (usually /etc/rcS)
is: default init level (usually 3, sometimes 2)
s#: script associated with a run level (usually /sbin/rc#)

rc scripts: The rc scripts execute the files in the /etc/rc#.d directories. They are run by the /sbin/rc# scripts, each of which corresponds to a run level. Debugging can often be done on these scripts by adding echo lines to a script to print either an "I got this far" message or the value of a problematic variable.

x86: Boot Process

The following table describes the boot process on x86 based systems.

Table 16-2 x86: Description of the Boot Process

BIOS
  1. When the system is turned on, the BIOS runs self-test diagnostics to verify the system's hardware and memory. The system begins to boot automatically if no errors are found. If errors are found, error messages are displayed that describe recovery options. The BIOS of additional hardware devices are run at this time.
  2. The BIOS boot program tries to read the first disk sector from the boot device. This first disk sector on the boot device contains the master boot record, mboot, which is loaded and executed. If no mboot file is found, an error message is displayed.

Boot Programs
  3. The master boot record, mboot, contains the disk information needed to find the active partition and the location of the Solaris boot program, pboot. mboot loads and executes pboot.
  4. The Solaris boot program, pboot, loads bootblk, the primary boot program. The purpose of bootblk is to load the secondary boot program, which is located in the UFS file system.
  5. If there is more than one bootable partition, bootblk reads the fdisk table to locate the default boot partition, and builds and displays a menu of available partitions. You have 30 seconds to select an alternate partition from which to boot. This step occurs only if there is more than one bootable partition present on the system.
  6. bootblk finds and executes the secondary boot program, boot.bin or ufsboot, in the root (/) file system. You have five seconds to interrupt the autoboot to start the Solaris Device Configuration Assistant.
  7. The secondary boot program, boot.bin or ufsboot, starts a command interpreter that executes the /etc/bootrc script. This script provides a menu of choices for booting the system. The default action is to load and execute the kernel. You have a 5-second interval to specify a boot option or to start the boot interpreter.

Kernel Initialization
  8. The kernel initializes itself and begins loading modules by using the secondary boot program (boot.bin or ufsboot) to read the files. When the kernel has loaded enough modules to mount the root (/) file system, the kernel unmaps the secondary boot program and continues, using its own resources.
  9. The kernel creates a user process and starts the /sbin/init process, which starts other processes by reading the /etc/inittab file.

init
  10. In this Oracle Solaris release, the /sbin/init process starts /lib/svc/bin/svc.startd, which starts system services that do the following:
      Check and mount file systems
      Configure network and devices
      Start various processes and perform system maintenance tasks
      In addition, svc.startd executes the run control (rc) scripts for compatibility.

x86: Boot Files

In addition to the run control scripts and boot files, there are additional boot files that are associated with booting x86 based systems.

Table 16-3 x86: Boot Files

/etc/bootrc – Contains menus and options for booting the Oracle Solaris release.
/boot – Contains files and directories needed to boot the system.
/boot/mdboot – DOS executable that loads the first-level bootstrap program (strap.com) into memory from disk.
/boot/mdbootbp – DOS executable that loads the first-level bootstrap program (strap.com) into memory from diskette.
/boot/rc.d – Directory that contains install scripts. Do not modify the contents of this directory.
/boot/solaris – Directory that contains items for the boot subsystem.
/boot/solaris/boot.bin – Loads the Solaris kernel or stand-alone kmdb. In addition, this executable provides some boot firmware services.
/boot/solaris/boot.rc – Prints the Oracle Solaris OS version on an x86 system and runs the Device Configuration Assistant in DOS-emulation mode.
/boot/solaris/bootconf.exe – DOS executable for the Device Configuration Assistant.
/boot/solaris/bootconf.txt – Text file that contains internationalized messages for the Device Configuration Assistant (bootconf.exe).
/boot/solaris/bootenv.rc – Stores eeprom variables that are used to set up the boot environment.
/boot/solaris/devicedb – Directory that contains the master file, a database of all possible devices supported with realmode drivers.
/boot/solaris/drivers – Directory that contains realmode drivers.
/boot/solaris/itup2.exe – DOS executable run during the install time update (ITU) process.
/boot/solaris/machines – Obsolete directory.
/boot/solaris/nbp – File associated with network booting.
/boot/solaris/strap.rc – File that contains instructions on what load module to load and where in memory it should be loaded.
/boot/strap.com – DOS executable that loads the second-level bootstrap program into memory.

Note – rpc.bootparamd, which is usually a requirement on the server side for performing a network boot, is not required for a GRUB based network boot.

The GRUB menu.lst file lists the contents of the GRUB main menu. The GRUB main menu lists boot entries for all the OS instances that are installed on your system, including Solaris Live Upgrade boot environments. The Solaris software upgrade process preserves any changes that you make to this file.
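As a minimal sketch, a Solaris 10 boot entry in menu.lst looks like the following; the title text and the (hd0,0,a) disk/slice tuple are illustrative and vary per installation:

title Solaris 10
root (hd0,0,a)
kernel /platform/i86pc/multiboot
module /platform/i86pc/boot_archive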
Solaris 10 boot process : SPARC

The boot process for the SPARC platform involves 5 phases. There is a slight difference between the boot process of a SPARC based and an x86/x64 based Solaris operating system; the x86/x64 process is covered in the next post.

Boot PROM phase
1. The boot PROM runs the power-on self test (POST) to test the hardware.
2. The boot PROM displays the banner with the below information:
   Model type
   Processor type
   Memory
   Ethernet address and host ID
3. The boot PROM reads the PROM variable boot-device to determine the boot device.
4. The boot PROM reads the primary boot program (bootblk) from sectors 1 to 15 and executes it.

Boot program phase
1. bootblk loads the secondary boot program ufsboot into memory.
2. ufsboot reads and loads the kernel. The kernel is composed of 2 parts:
   unix (platform-specific kernel)
   genunix (platform-independent kernel)
3. ufsboot combines these 2 parts into one complete kernel and loads it into memory.

Kernel initialization phase
1. The kernel reads the configuration file /etc/system.
2. The kernel initializes itself and loads the kernel modules. The modules usually reside in the /kernel and /usr/kernel directories. (Platform-specific drivers are in the /platform/`uname -i`/kernel and /platform/`uname -m`/kernel directories.)

Init phase
1. The kernel starts the /etc/init daemon (with PID 1).
2. The /etc/init daemon starts the svc.startd process, which is responsible for starting and stopping services.
3. The /etc/init daemon uses a file called /etc/inittab to boot the system to the appropriate run level mentioned in this file.

Legacy Run Levels

A run level specifies the state in which specific services and resources are available to users.
0 - System running the PROM monitor (ok prompt).
s or S - Single-user mode with critical file systems mounted (a single user can access the OS).
1 - Single-user administrative mode with access to all file systems (a single user can access the OS).
2 - Multi-user mode. Multiple users can access the system. NFS and some other network-related daemons do not run.
3 - Multi-user server mode. Multi-user mode with NFS and all other network resources available.
4 - Not implemented.
5 - Transitional run level. The OS is shut down and the system is powered off.
6 - Transitional run level. The OS is shut down and the system is rebooted to the default run level.

svc.startd phase
1. After the kernel starts the svc.startd daemon, svc.startd executes the rc scripts in the /sbin directory based upon the run level.

rc scripts

Each run level has an associated script in the /sbin directory:

# ls -l /sbin/rc?
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc0
-rwxr--r--   1 root     sys         2031 Sep 20  2012 /sbin/rc1
-rwxr--r--   1 root     sys         2046 Sep 20  2012 /sbin/rc2
-rwxr--r--   1 root     sys         1969 Sep 20  2012 /sbin/rc3
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc5
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc6
-rwxr--r--   1 root     sys         4069 Sep 20  2012 /sbin/rcS

Each rc script runs the corresponding /etc/rc?.d/K* and /etc/rc?.d/S* scripts. For example, for run level 3, the below scripts will be executed by /sbin/rc3:
/etc/rc3.d/K*
/etc/rc3.d/S*

The syntax of the start and stop run control scripts is:
S##name_of_script - Start run control script
K##name_of_script - Stop run control script

Note the S and K in caps. Scripts starting with a small s or k will be ignored, which can be used to disable a script for that particular run level.
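As an example, a start script can be disabled for run level 3 by renaming it to lower case (the script name S90samba is hypothetical):

# cd /etc/rc3.d
# mv S90samba s90samba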
Solaris 10 boot process : x86/x64

In the last post we saw the boot process of Solaris 10 on the SPARC platform. The boot process on x86/x64 hardware is a bit different from SPARC, but it involves the same 5-phase boot process.

Boot PROM phase
1. The BIOS (Basic Input Output System) ROM runs the power-on self test (POST) to test the hardware.
2. The BIOS tries to boot from the device mentioned in the boot sequence. (We can change this by pressing F12 or F2.)
3. When booting from the boot disk, the BIOS reads the master boot program (mboot) on the first sector, along with the FDISK table.

Boot program phase
1. mboot finds the active partition in the FDISK table and loads the first sector containing GRUB stage1.
2. GRUB stage1 in turn loads GRUB stage2.
3. GRUB stage2 locates the GRUB menu file /boot/grub/menu.lst and displays the GRUB main menu.
4. Here the user can select to boot the OS from a partition, a disk, or the network.
5. The GRUB commands in /boot/grub/menu.lst are executed to load a pre-constructed primary boot archive (usually /platform/i86pc/boot_archive in Solaris 10).
6. GRUB loads a program called multiboot, which assembles the core kernel modules from the boot_archive and starts the OS by mounting the root filesystem.

Kernel initialization phase
1. The kernel reads the configuration file /etc/system.
2. The kernel initializes itself and loads the kernel modules. The modules usually reside in the /kernel and /usr/kernel directories. (Platform-specific drivers are in the /platform/`uname -i`/kernel and /platform/`uname -m`/kernel directories.)

Init phase
1. The kernel starts the /etc/init daemon (with PID 1).
2. The /etc/init daemon starts the svc.startd process, which is responsible for starting and stopping services.
3. The /etc/init daemon uses the file /etc/inittab. A sample inittab file looks like this:

ap::sysinit:/sbin/autopush -f /etc/iu.ap
sp::sysinit:/sbin/soconfig -f /etc/sock2path
smf::sysinit:/lib/svc/bin/svc.startd >/dev/msglog 2<>/dev/msglog </dev/console
p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0 >/dev/msglog 2<>/dev/msglog

The inittab file shown above contains four fields:
id:rstate:action:process
The process field specifies the process to execute for the action keyword. For example, "/usr/sbin/shutdown -y -i5 -g0" is the process to execute for the action "powerfail".

Legacy run levels, svc.startd phase and rc scripts

The legacy run levels, the svc.startd phase, and the rc script handling on x86/x64 are identical to those described in the SPARC post above.
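Whatever the platform, the run level the system is currently at can be checked with who -r; a sketch of typical output:

# who -r
   .       run-level 3  Sep 20 10:15     3      0  S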
GRUB booting on Solaris x86

Let's talk a bit more about booting Solaris on the x86 architecture (NOT on SPARC). On Solaris 10, the default boot loader is GRUB (GRand Unified Bootloader). GRUB loads the boot archive into the system's memory. So what's a boot archive? Simply put, it's a bunch of critical files that are needed at boot time before the / (root) file system is mounted: kernel modules and configuration files. Sun says, "The boot archive is the interface that is used to boot the Solaris OS". Remember, there is no boot archive on SPARC, only on the x86 architecture.

GRUB has a menu so you can select the OS instance you want to boot. Sometimes you may need to perform the two tasks below, and luckily there are nice commands for that (more on them later):
Rebuild the boot archive
Install the GRUB boot loader

But first things first, here is an overview of booting:
The system is powered on.
The BIOS (yes, this is x86, not SPARC) initializes the CPU, memory, and platform hardware.
The BIOS loads the boot loader (GRUB) from the boot device.
GRUB takes control and shows a menu with boot options (predefined in the configuration file /boot/grub/menu.lst). There are also options to edit a command (press "e") or to get a CLI (press "c").

Example: boot in single-user mode
Press e when the GRUB main menu shows up.
Move to the kernel /platform/i86pc/multiboot line. (It has additional options if you boot from ZFS.)
Press e to edit this command. The grub edit> prompt shows up.
Type -s at the end of the line. Pressing Enter brings you back to the main menu.
Once in the main menu, press b to boot into single-user mode.
The same can be done for a reconfiguration boot by adding -r instead; if you need verbose output for better troubleshooting, use -v.

GRUB boots the primary boot archive (see the menu.lst line: module /platform/i86pc/boot_archive) and the multiboot program (see the menu.lst line: kernel /platform/i86pc/multiboot). The primary boot archive is a file system image containing kernel modules and data; it goes into memory at this moment. The multiboot program is an executable file that takes control from GRUB, reads the boot archive, and assembles the core kernel modules into memory.

GRUB's functional components are:
stage1, installed on the first sector of the fdisk partition.
stage2, installed in a reserved area in the fdisk partition. This is the core image of GRUB.
/boot/grub/menu.lst

The boot behavior can be modified using the eeprom command, which edits the file /boot/solaris/bootenv.rc. See that file for more info.

Update a corrupt boot archive

Well, sooner or later you will have to do this, trust me :(
Boot "Solaris failsafe".
You may get a prompt to sync the out-of-date boot archive on, say, /dev/dsk/c0t0d0s0 - answer y.
Mount the device with the corrupted boot archive on /a.
Then forcibly update the corrupted boot archive on the alternate root:

bootadm update-archive -f -R /a
umount /a
init 6

Good luck! Tip: set up a cron job to run bootadm update-archive on a regular basis, and run it manually after a system upgrade or patch install.
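A minimal sketch of such a root crontab entry, assuming bootadm lives in /sbin and picking an arbitrary nightly schedule:

# crontab -e
30 2 * * * /sbin/bootadm update-archive > /dev/null 2>&1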
The primary boot archive contains the files below (if any of them is updated, rebuild the boot archive with bootadm update-archive):

boot/solaris/bootenv.rc
boot/solaris.xpm
etc/dacf.conf
etc/devices
etc/driver_aliases
etc/driver_classes
etc/match
etc/name_to_sysnum
etc/path_to_inst
etc/rtc_config
etc/system
kernel
platform/i86pc/biosint
platform/i86pc/kernel

Installing GRUB

This is also something you may need to do; say you are mirroring two disks using SVM and want to install GRUB on the second disk in case you need to boot from there. To install GRUB in the master boot sector, run (replace c1t0d0s0 with your own device if needed):

installgrub -fm /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0

Actually, besides the primary boot archive there is one more - the failsafe boot archive. It can boot on its own, requires no maintenance, and is created during OS installation.

SPARC: Installing a Boot Block on a System Disk

The following example shows how to install the boot block on a UFS root file system:

# installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

The following example shows how to install the boot block on a ZFS root file system:

# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

SPARC EXAMPLES

The ufs bootblock is in /usr/lib/fs/ufs/bootblk. To install the bootblock on slice 0 of target 0 on controller 1, use:

example# /usr/sbin/installboot /usr/lib/fs/ufs/bootblk \
    /dev/rdsk/c1t0d0s0

x86 EXAMPLES

The ufs bootblock is in /usr/lib/fs/ufs/pboot. To install the bootblock on slice 2 of target 0 on controller 1, use:

example# /usr/sbin/installboot /usr/lib/fs/ufs/pboot \
    /usr/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2

Searchable Keywords: bootblock bootsector boot bootblk

installboot - install bootblocks in a disk partition

FILES
/usr/platform/platform-name/lib/fs/ufs - Directory where ufs boot objects reside.

SPARC SYNOPSIS
installboot bootblk raw-disk-device

To install a ufs bootblock on slice 0 of target 0 on controller 1 of the platform where the command is being run, use:

Solaris 2.x:
example# /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk \
    /dev/rdsk/c1t0d0s0

Solaris 2.8:
# installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/cXtXdXsX

Here is the doc from Sun:

"The file just loaded does not appear to be executable."

This error occurs when the default bootblock has been corrupted. To overcome this problem, do the following:
1. Get down to the ok prompt by typing init 0, hitting <Stop> a, hitting <L1> a, or unplugging the keyboard.
2. Boot to single-user mode from the Solaris 2.6 OS CDROM:
   ok boot cdrom -s
3. Run an fsck on your disk:
   # fsck /dev/rdsk/c#t#d#s#
4. Mount your root file system:
   # mount /dev/dsk/c#t#d#s# /a
5. Make sure that the restoresymtable file does not exist. If it does, remove it.
6. Next, install the bootblock on your disk:
   # installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/c#t#d#s#
7. Unmount your root file system:
   # umount /a
8. Run another fsck on your disk:
   # fsck /dev/rdsk/c#t#d#s#
9. Reboot your machine:
   # init 6

Notes:
* The location of the bootblk file may differ depending on the type of hardware platform, for example /usr/platform/sun4u vs /usr/platform/sun4m.
* Not sure you really need to mount the partition? I guess it is to verify you have the right one.
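Before running installboot, it is worth confirming which platform directory applies on the machine at hand; the platform name in the output below is just an example:

# uname -i
SUNW,Ultra-5_10
# ls /usr/platform/`uname -i`/lib/fs/ufs/bootblk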
----------------- Other CPU type examples -----------------

x86 EXAMPLES

To install the ufs bootblock and partition boot program on slice 2 of target 0 on controller 1 of the platform where the command is being run, use:

example# installboot /usr/platform/`uname -i`/lib/fs/ufs/pboot \
    /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2

PowerPC Edition EXAMPLES

To install the ufs bootblock and openfirmware program on target 0 on controller 1 of the platform where the command is being run, use:

example# installboot -f /platform/`uname -i`/openfirmware.x41 \
    /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2

FILES
/usr/platform/platform-name/lib/fs/ufs - Directory where ufs boot objects reside.

SunOS-------SunOS---------SunOS-------------SunOS

OPTIONS
-h  Leave the a.out header on the bootblock when installed on disk.
-l  Print out the list of block numbers of the boot program.
-t  Test. Display various internal test messages.
-v  Verbose. Display detailed information about the size of the boot program, etc.

EXAMPLE

To install the bootblocks onto the root partition on a Xylogics disk:

example% cd /usr/kvm/mdec
example% installboot -vlt /boot bootxy /dev/rxy0a

For an SD disk, you would use bootsd and /dev/rsd0a, respectively, in place of bootxy and /dev/rxy0a. Example:

example% /usr/kvm/mdec/installboot /boot bootsd /dev/rsd3a

or

% cd /usr/kvm/mdec
% ./installboot /boot bootsd /dev/rsd3a

NOTE: The /boot is the boot file that resides in the root directory.

# redo boot block
echo 'doing boot block install'
cd /usr/kvm/mdec
./installboot /kahuna/boot bootsd /dev/rsd2a

NOTE: Inside the /usr/kvm/mdec dir there must be bootsd (for SCSI devices) and bootfd (for floppies); if these aren't there, installboot isn't going to work.

If all goes well you should see something like this:

Secondary boot: /mnt/boot
Boot device: /dev/rsd0a
Block locations:
startblk  size
     720    10
     730    10
     740    10
     750    10
     760    10
     770    10
     780    10
     790    10
     7a0    10
     7b0    10
     7c0    10
     7d0    10
     7f0    10
     800    10
Bootblock will contain a.out header
Boot size: 0x1af10
Boot checksum: 0x3a5a8103
Boot block installed

The four lines at the bottom are the real tellers!

Example in detail:

/usr/kvm/mdec/installboot -vlt /mnt2/boot bootsd /dev/rsd1a
                               |          |      |
                               |          |      +-- raw device (mounted on /mnt2)
                               |          +-- SCSI boot device file
                               +-- where sd1a is mounted and boot resides

Options:
-l  Print out the list of block numbers of the boot program.
-t  Test. Display various internal test messages.
-v  Verbose. Display detailed information about the size of the boot program, etc.

---- More SunOS ----

So you have some older SunOS 4.x dump images you want to put on another machine:

# mount
/dev/sd0a on / type 4.2 (rw)
/dev/sd0g on /usr type 4.2 (rw)
diastolic:/systems/cs712a_dumpimages on /mnt type nfs (rw)    <--- image
/dev/sd1a on /mnt2 type 4.2 (rw,noquota)                      <--- new disk

# cat /mnt/cs712a.sd0a.01-04-02.Z | uncompress | restore xf -
How to recover/reset the root password in Sun Solaris (SPARC)

Every once in a while, someone loses, or rather forgets, the root password of a Sun Solaris server. If this happens, there is a way out. In fact, the only way is to reset the password, as there is no way to recover it. Resetting the password involves booting the server into single-user mode and mounting the root file system. Of course, it is recommended that physical access to the server be restricted, to ensure that there is no unauthorized access and that anyone who follows this routine is authorized personnel.

Boot the server with a Sun Solaris Operating System CD (I'm using a Solaris 10 CD, but it doesn't really matter) or perform a network boot from a JumpStart server, starting at the OBP ok prompt:

ok boot cdrom -s
or
ok boot net -s

This will boot the server from the CD or JumpStart server and launch single-user mode (no password required). Mount the root file system (assumed to be /dev/dsk/c0t0d0s0 here) onto /a:

solaris# mount /dev/dsk/c0t0d0s0 /a

NOTE: /a is a temporary mount point that is available when you boot from CD or a JumpStart server.

Now, with the root file system mounted on /a, all you need to do is edit the shadow file and remove the encrypted password for root:

solaris# vi /a/etc/shadow

Next, leave the mounted filesystem, unmount the root filesystem, and reboot the system to single-user mode booting off the disk:

solaris# cd /
solaris# umount /a
solaris# init s

This should boot off the disk and take you to single-user mode. Press Enter when prompted for the root password. This should allow you to log in to the system. Once in, set the password and change to multi-user mode:

solaris# passwd root
solaris# reboot

NOTE: Single-user mode is only to ensure that the root user without a password is not exposed to others, as it would be if the system started in multi-user mode before a new password was set.

That should do it. You should now be able to log on with the new password set for root.
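To make the shadow edit concrete: the root entry in /a/etc/shadow before and after removing the password field would look roughly like this (the hash shown is made up):

root:ab8rMPJjmxo2c:6445::::::     (before)
root::6445::::::                  (after, second field emptied)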
Solaris Cluster 3.x: Cluster node panicked with "rgmd", "rpc.fed", "pmfd" daemon died some 30 or 35 seconds ago message. Resolution Path (Doc ID 1020514.1)

Applies to: Solaris Cluster - Version 3.0 to 3.3 [Release 3.0 to 3.3], All Platforms

Symptoms

This document provides the basic steps to resolve the following failfast panics:

Failfast: Aborting zone "global" (zone ID 0) because "pmfd" died 35 seconds ago.
Failfast: Aborting because "rgmd" died 30 seconds ago.
Failfast: Aborting because "rpc.fed" died 30 seconds ago.
Failfast: Aborting because "rpc.pmfd" died 30 seconds ago.
Failfast: Aborting because "clexecd" died 30 seconds ago.
Failfast: Aborting because "globalrgmd" died 30 seconds ago.

Cause

The Solaris Cluster node panics due to a cluster daemon exiting.

Solution

Why does it happen? The panic message indicates that the cluster-specific daemon shown in the message died. The panic is a recovery action taken by the failfast mechanism of the cluster when it detects a critical problem. As those processes are critical and cannot be restarted, the cluster shuts down the node using the failfast driver. Critical daemons are registered with the failfast (ff) driver with some time interval. If a daemon does not report back to the failfast driver within the registered time interval (e.g., 30 seconds), the driver will trigger a Solaris kernel panic.

Troubleshooting steps

To find the root cause of the problem, you need to find out why the cluster-specific daemon shown in the messages died. The following steps show how to identify the root cause.

1. Check the /var/adm/messages system log file for system or operating system errors indicating that memory resources may have been limited, such as in the following examples. If those messages appear before the panic messages, the root cause would be memory exhaustion, since a process may dump an application core and die when the system lacks memory. If you find messages indicating a lack of memory, you will need to find out why the system was low on memory and/or swap and address it to avoid this panic. If a cluster daemon cannot fork a new process or cannot allocate memory (malloc()), it will likely exit and trigger a panic.

Apr 2 18:05:13 sun-server1 cl_runtime: [ID 661778 kern.warning] WARNING: clcomm: memory low: freemem 0xfb

Another indication is messages reporting that swap space was limited, such as in the following examples:

Apr 2 18:05:12 sun-server1 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 25825 (in.ftpd)
Apr 2 18:05:03 sun-server1 tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

Example when a daemon or process cannot fork processes:

Apr 2 18:05:10 sun-server1 Cluster.PMF.pmfd: [ID 837760 daemon.error] monitored processes forked failed (errno=12)

For additional information, from a kernel core file you can see the messages given before the panic using mdb. Check those messages as well as the /var/adm/messages file:

# cd /var/crash/`uname -n`
# echo "::msgbuf -v" | mdb -k unix.0 vmcore.0

2. Some bugs that cause cluster daemons to exit (and thus this panic) were fixed in the Core Patch or Core/Sys Admin patches, so check whether your system still has an old patch installed; an example follows the bug list below. Check the README of the patch installed on your machine for any relevant bugs fixed between the installed patch and the latest one. See: Solaris Cluster 3.x update releases and matching / including Solaris Cluster 3.x core patches (Doc ID 1368494.1). The following is a list of some bugs that could cause this panic. Note that this is not a comprehensive list; always check MOS for the most current bugs:

Bug 15529411: SUNBT6784007-OSC RUNNING SCSTAT(1M) CAUSES MEMORY TO BE LEAKED IN RGMD
Bug 15507769: SUNBT6747452-3.2U2_FCS RU FAILED- FAILFAST: ABORTING BECAUSE "GLOBALRGMD" DIED 3
Bug 15384429: SUNBT6535144 FAILFAST: ABORTING ZONE "GLOBAL" (ZONE ID 0)BECAUSE "RGMD" DIED 30
Bug 15507579: SUNBT6739317-OSC DURING HA_FRMWK_FI, FAILFAST: ABORTING ZONE "GLOBAL" (ZONE ID 0
Bug 15207664: SUNBT5035341-3.2_FCS FAILFAST: ABORTING BECAUSE "CLEXECD" DIED 30 SECONDS AGO
Bug 15108998: SUNBT4690244 FAILFAST: ABORTING BECAUSE "RGMD" DIED 30 SECONDS AGO
Bug 15335430: SUNBT6438132 RGMD DUMPED CORE WHILE RESOURCES WERE BEING DISABLED
Bug 15345802: SUNBT6460419-3.2_FCS SYNTAX ERROR IN SCSWITCH KILLS RGMD
Bug 15282517: SUNBT6312828-3.2_FCS CLUSTER PANICS WITH 'RGMD DIED' PANIC WHEN LD_PRELOAD IS SE
Bug 15263959: SUNBT6192133-3.1U4 RGMD CORE DUMPED DURING FUNCTIONAL TESTS ON SC32/SC31U4 CLUST
Bug 15273568: SUNBT6290248-3.1U4 RGMD DUMPED CORE WHILE RS STOP FAILED FLAG WAS BEING CLEARED
Bug 15126659: SUNBT4756973 RGMD USES IDL OBJECT AFTER FAILED IDL CALL IN SCHA CONTROL GIVEOVER
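As a sketch, the installed revision of a given patch can be checked with showrev; the patch ID below is purely illustrative, substitute the core patch ID for your cluster release and architecture:

# showrev -p | grep 126106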
3. The failfast panic will generate a kernel core file; however, in general, it does not help you find the reason why a process died. In most cases, when this panic happens, a process dies due to an application core dump, and that application core file will help you find the root cause. To collect an application core file, use the coreadm command so that core files are uniquely named and stored in a consistent place. Run the following commands on each cluster node:

mkdir -p /var/cores
coreadm -g /var/cores/%f.%n.%p.%t.core \
    -e global \
    -e global-setid \
    -e log \
    -d process \
    -d proc-setid

See also:
How to Use Pkgapp to Gather Libraries to Debug Core/Gcore/Crash Files in Solaris and Linux (Doc ID 1274584.1)
How to use coreadm to name core files in Solaris (Doc ID 1001674.1)
(The "-e process" option enables per-process core dumps for coreadm.)

If you made modifications to coreadm, test to make sure that your settings are working and that you can collect a core:

# ps -ef | grep rgmd
    root  1829     1   0   Dec 24 ?           0:00 /usr/cluster/lib/sc/rgmd
    root  1833   744   0   Dec 24 ?        3195:27 /usr/cluster/lib/sc/rgmd -z global
# gcore -g 1829
gcore: /var/cores/rgmd.clnode1.1829.1393620353.core dumped

This leaves rgmd running and only collects its core.

# ps -ef | grep rgmd
    root  1829     1   0   Dec 24 ?           0:00 /usr/cluster/lib/sc/rgmd
    root  1833   744   0   Dec 24 ?        3195:30 /usr/cluster/lib/sc/rgmd -z global
# ls -l /var/cores/rgmd.clnode1.1829.1393620353.core
-rw------- 1 root root 22978997 Feb 28 20:45 /var/cores/rgmd.clnode1.1829.1393620353.core
# file /var/cores/rgmd.clnode1.1829.1393620353.core
/var/cores/rgmd.clnode1.1829.1393620353.core: ELF 32-bit MSB core file SPARC Version 1, from 'rgmd'

4. If you have an application core file of a cluster-specific daemon, you may want to analyze it. To analyze a core file, you can start with the document Solaris Application Core Analysis Product Page (Doc ID 1384397.1). For a quick analysis, use the pstack command. It gives you a stack trace of the core file, which can be used to search for existing bugs in SunSolve[SM]. It is also a good idea to give your Sun[TM] Services representative the pstack output for further analysis. Example:

# /usr/bin/pstack /var/cores/rgmd.halab1.7699.1242026038.core

See also: How to Enable Method and/or System Coredumps for Cluster Methods That Timeout (Doc ID 1019001.1)