kubelet fails with error “misconfiguration…”

Error:
kubelet 1.6.0 fails with the error: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

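Solution:
Make the kubelet cgroup driver match the one Docker uses. Edit the kubeadm drop-in file:
vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
and change
KUBELET_CGROUP_ARGS=--cgroup-driver=systemd
to
KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs
(or the reverse, whichever matches the "Cgroup Driver" value reported by "docker info"). Then run "systemctl daemon-reload" and "systemctl restart kubelet" so the change takes effect.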
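
The same edit can be scripted instead of done by hand in vi. The following is only a minimal sketch: set_cgroup_driver is a hypothetical helper name (not part of kubeadm), it assumes GNU sed, and the commented commands at the end assume that "docker info" prints a "Cgroup Driver:" line.

```shell
#!/bin/sh
# Sketch: set the kubelet --cgroup-driver value in a kubeadm drop-in file.
# set_cgroup_driver is a hypothetical helper, not a kubeadm command.
#   $1 = path to the drop-in file
#   $2 = driver name to write (cgroupfs or systemd)
set_cgroup_driver() {
    # Replace whatever driver name currently follows --cgroup-driver=
    sed -i "s/--cgroup-driver=[a-z]*/--cgroup-driver=$2/" "$1"
}

# Typical use as root, left commented out so nothing runs by accident:
# driver=$(docker info 2>/dev/null | sed -n 's/^ *Cgroup Driver: *//p')
# set_cgroup_driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf "$driver"
# systemctl daemon-reload
# systemctl restart kubelet
```

Reading the driver from Docker rather than hard-coding it keeps the two sides in agreement regardless of which one the distribution defaults to.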

Installing CUDA 7.5 for Tesla M40 on Ubuntu 14.04.5 LTS

Install Driver

  1. Download the Tesla driver (http://www.nvidia.com/Download/index.aspx?lang=en-us)
  2. Switch to runlevel 3
    $ telinit 3
  3. Stop the lightdm service
    $ service lightdm stop
  4. Make the driver package executable
    $ chmod +x NVIDIA-Linux-x86_64-352.99.run

Continue reading

Limiting CPU Usage of A Process in CentOS/RHEL 7

In HPC, we may need to protect the head node from unnecessarily heavy processes that can cause login problems for users. One solution is cpulimit: a cron job can monitor all processes and cap the CPU usage of the heavy ones. This is how I usually do it on CentOS/RHEL 7.x.

  1. Install the cpulimit package from the EPEL repo (the EPEL repo must already be enabled).

yum install cpulimit

  2. Create a script to monitor processes. The script below is a modified version of the script in this forum. You can adjust the first three variables: CPU_LIMIT, BLACK_PROCESSES_LIST, and WHITE_PROCESSES_LIST.
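
As an illustration only (this is not the author's script), a minimal sketch of such a monitor might look like the following. The three variable names come from the text above; their exact semantics here are assumptions, as are the helper names select_pids and limit_heavy_processes. Only the cpulimit options -p (PID), -l (limit in percent), and -z (exit when the target exits) are used.

```shell
#!/bin/sh
# Sketch of a cpulimit monitor. Variable semantics are assumptions:
CPU_LIMIT=20                          # percent; trigger threshold and cap
BLACK_PROCESSES_LIST=""               # if set, limit ONLY commands matching this regex
WHITE_PROCESSES_LIST="sshd|systemd"   # never limit commands matching this regex

# Read "pid pcpu comm" lines on stdin and print the PIDs to limit.
select_pids() {
    awk -v limit="$CPU_LIMIT" \
        -v black="$BLACK_PROCESSES_LIST" \
        -v white="$WHITE_PROCESSES_LIST" '
        $2 + 0 <= limit            { next }   # not heavy enough
        white != "" && $3 ~ white  { next }   # explicitly protected
        black != "" && $3 !~ black { next }   # not on the target list
        { print $1 }'
}

# Attach one cpulimit instance to every selected PID.
limit_heavy_processes() {
    ps -eo pid=,pcpu=,comm= | select_pids | while read -r pid; do
        # Skip PIDs that already have a cpulimit watching them
        pgrep -f "cpulimit -p $pid " >/dev/null && continue
        cpulimit -p "$pid" -l "$CPU_LIMIT" -z >/dev/null 2>&1 &
    done
}

# Call limit_heavy_processes at the end of the script when it is run
# from cron (the cron schedule and script path are up to you).
```

Keeping the selection logic in select_pids, separate from the part that launches cpulimit, makes it possible to check the black/white list behaviour against sample input before letting the script touch real processes.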

Continue reading

Boot Gets Stuck at “Wait for Plymouth Boot Screen to Quit”

If you can’t get to the login screen after installing the CUDA driver (booting gets stuck at “Wait for Plymouth Boot Screen to Quit”), it’s probably because the X server is trying to load the xorg.conf created by the NVIDIA driver installer. I ran into this on my laptop, which has Intel + NVIDIA GPUs and runs CentOS 7.

Workaround Solution: Continue reading