: > /dev/null

Apr 24, 2017

My favourite commands

I often need to quickly "audit" a GNU/Linux server when dealing with a large farm of new servers. After some years, this is what I use:

  • w(1): shorter to type than uptime + who to get load average
# w
 00:05:25 up 51 days,  9:50,  2 users,  load average: 0.01, 0.06, 0.05
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    localhost        22:03    2.00s  3:05   0.01s w
  • pstree(1): shorter to type than ps with many options, output grouped
# pstree
systemd─┬─acpid
        ├─2*[agetty]
        ├─apache2───7*[apache2]
        ├─atd
        ├─collectdmon───collectd───10*[{collectd}]
        ├─cron
        ├─dbus-daemon
        ├─dmeventd
        ├─haveged
        ├─lvmetad
        ├─master─┬─pickup
        │        └─qmgr
        ├─memcached───5*[{memcached}]
        ├─mysqld_safe───mysqld───24*[{mysqld}]
        ├─nginx───2*[nginx]
        ├─nrpe
        ├─ntpd
        ├─puppet───{ruby-timer-thr}
        ├─rsyslogd─┬─{in:imklog}
        │          ├─{in:imuxsock}
        │          └─{rs:main Q:Reg}
        ├─sshd─┬─sshd───bash───pstree
        │      └─sshd───sshd───bash───vi
        ├─2*[systemd───(sd-pam)]
        ├─systemd-journal
        ├─systemd-logind
        └─systemd-udevd
  • lsblk(8): list block devices 'topology' (from util-linux)
# lsblk 
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda               254:0    0   40G  0 disk 
├─vda1            254:1    0  243M  0 part /boot
└─vda2            254:2    0 39.8G  0 part 
  ├─debian-root   253:0    0 38.9G  0 lvm  /
  └─debian-swap_1 253:1    0  872M  0 lvm  [SWAP]
  • lsb_release(1): show full operating system information (may not be installed everywhere, but it will be if a configuration management system such as puppet, or an inventory system, is installed)
# lsb_release --all
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.5 (jessie)
Release:        8.5
Codename:       jessie
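
The four commands above can be chained into one first-look pass. This is only a minimal sketch: the function name audit_host is made up for the example, and it assumes nothing more than a POSIX shell. A tool that is not installed is skipped rather than treated as an error.

```shell
# Quick first-look audit: run each tool if it is installed, skip it otherwise.
# audit_host is a hypothetical helper name, not an existing command.
audit_host() {
    for cmd in "w" "pstree" "lsblk" "lsb_release --all"; do
        tool=${cmd%% *}    # the first word is the binary name
        printf '== %s ==\n' "$cmd"
        if command -v "$tool" >/dev/null 2>&1; then
            # $cmd is left unquoted on purpose so that options are split off
            $cmd 2>&1 || echo "($tool exited with an error)"
        else
            echo "($tool is not installed, skipped)"
        fi
    done
}

audit_host
```

Piping the result through less, or running it over ssh on each new server, keeps the whole audit to a single command.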

Some other commands I also use:

  • whatis(1): display on-line manual page description of a binary
  • pgrep(1): shorter to type than its friend ps $your_favourite_options | grep $pattern
# pgrep $pattern
4304
4711

Add --list-name (-l) to print the process name, or --list-full (-a) to print the full command line:

# pgrep -a mysql
4304 /bin/sh /usr/bin/mysqld_safe
4711 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
  • namei(1): follow a pathname and print the attributes of every traversed file, useful to verify permissions for an unprivileged user
# namei --long /etc/hosts                                    
f: /etc/hosts
drwxr-xr-x root root /
drwxr-xr-x root root etc
-rw-r--r-- root root hosts
  • tailf(1): shorter to type than tail -f and doesn't update the access time when not needed
  • httpie(1): HTTP client for humans with colored output; it can replace curl -I when debugging web servers or caches
  • tshark(1): 'tiny wireshark', easier for humans to read than the venerable tcpdump, with the same syntax for BPF filters, and it dissects protocols. Suppose you want to inspect HTTP transactions with minimal detail:

tcpdump -nn -i lo port 80 -A

23:43:33.322766 IP 127.0.0.1.55345 > 127.0.0.1.80: Flags [P.], seq 3338438874:3338438967, ack 2071539399, win 1012, options [nop,nop,TS val 1110051799 ecr 1110049301], length 93
E.....@.@..o.........1.P....{y.............
B*..B*..GET /nginx_status?auto HTTP/1.1
User-Agent: collectd/5.4.1
Host: localhost
Accept: */*


23:43:33.329166 IP 127.0.0.1.80 > 127.0.0.1.55345: Flags [P.], seq 1:249, ack 93, win 342, options [nop,nop,TS val 1110051801 ecr 1110051799], length 248
E.., .@.@............P.1{y.....7...V. .....
B*..B*..HTTP/1.1 200 OK
Date: Tue, 07 Jun 2016 21:43:33 GMT
Content-Type: text/plain
Content-Length: 107
Connection: keep-alive
Server: ohmy

Active connections: 4 
server accepts handled requests
 3139 3139 19247 
Reading: 0 Writing: 3 Waiting: 2

tshark -nn -i lo -f 'port 80'

  1   0.000000    127.0.0.1 -> 127.0.0.1    HTTP 159 GET /nginx_status?auto HTTP/1.1 
  2   0.000132    127.0.0.1 -> 127.0.0.1    HTTP 314 HTTP/1.1 200 OK  (text/plain)

Fewer details, but more readable at first glance.

  • ngrep(8): network grep is easier for humans than the venerable tcpdump for pretty-printing packet contents:

ngrep -q -W byline 'port 80'

T 127.0.0.1:55345 -> 127.0.0.1:80 [AP]
GET /nginx_status?auto HTTP/1.1.
User-Agent: collectd/5.4.1.
Host: localhost.
Accept: */*.
.


T 127.0.0.1:80 -> 127.0.0.1:55345 [AP]
HTTP/1.1 200 OK.
Date: Tue, 07 Jun 2016 21:38:23 GMT.
Content-Type: text/plain.
Content-Length: 107.
Connection: keep-alive.
Server: ohmy.
.
Active connections: 4 
server accepts handled requests
 3125 3125 19165 
Reading: 0 Writing: 3 Waiting: 2

You may have noticed that HTTP/2 is a binary protocol, so ngrep will no longer display its content (but tshark will, thanks to its dissector).

  • multitail(1): like tail, but it opens multiple files and uses colors to tell them apart; pressing Enter adds a red mark, so you can see where future updates of the file start (bye bye pressing Enter multiple times!)
  • jq(1): pretty-print JSON. This can replace python -m json.tool because jq has... colors! It also has powerful filters to manipulate values
  • watch(1): execute a command periodically, bye bye up arrow + Enter to replay a command multiple times!
  • netcat(1): the venerable telnet is great, but its escape sequence may be hard to type on some terminals, which is why I prefer netcat; you can also use its options to test layer 4 firewalls in scripts:
$ nc -w 3 -v -z www.iroqwa.org 80; echo $?
Connection to www.iroqwa.org 80 port [tcp/http] succeeded!
0
$ nc -w 3 -v -z www.iroqwa.org 8081; echo $?
nc: connect to www.iroqwa.org port 8081 (tcp) failed: Connection refused
nc: connect to www.iroqwa.org port 8081 (tcp) timed out: Operation now in progress
1

Option -w sets a timeout and -z doesn't send any data. Without -z, the Netcat escape sequence is ^C. This works with both nc.traditional and nc.openbsd.

  • view(1): open a file with vim(1) but read-only; this prevents vi from buffering the file and updating its last access time. You can also reach the editor by pressing v while viewing the file with less(1).
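
The netcat port test shown above is easy to wrap in a loop when several ports need checking. A rough sketch, assuming only the -w and -z flags already used in this post; check_ports is a name invented for the example, and any nc failure (refused, timed out, or nc not installed) is reported as "closed".

```shell
# Report each port of a host as open or closed, with the same nc flags as above.
# check_ports is a hypothetical helper; every nc failure is reported as "closed".
check_ports() {
    host=$1
    shift
    for port in "$@"; do
        if nc -w 3 -z "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port open"
        else
            echo "$host:$port closed"
        fi
    done
}

# Port 1 on localhost is assumed to have no listener here.
check_ports 127.0.0.1 1
```

Against a real target this would be called as check_ports www.iroqwa.org 80 8081, which prints one line per port and is easy to grep from a script.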

As you can see, I generally adopt tools that offer simple output, colorized when possible for easier viewing. Life is not only black and white on terminals!

Jun 22, 2016

Nginx with proxy cache and SPDY prematurely closes connection

I've enabled SPDY on nginx 1.6.2 shipped with Debian GNU/Linux 8 Jessie.

In Chrome's DevTools Console I saw:

GET https://mysite/myasset.ico net::ERR_CONNECTION_CLOSED

This happened when fetching my static assets. I started digging for a similar issue and found that the problem is not specific to Chrome: it is reproducible with Firefox too. A Chrome bug report pointed to the combination of nginx + ssl + spdy + proxy_cache as the culprit.

The problem is that the SPDY connection is closed prematurely if proxy caching is used. As SPDY multiplexes requests over a single connection, when that connection is closed, all other in-flight transactions are lost.

This bug is fixed in nginx 1.7.3 and later, so I will choose one of the following options:

  • Upgrade nginx by using Debian backport (>= 1.9.10)
  • Disable SPDY, as the protocol is superseded by HTTP/2 and I don't need it for now (it was just a one-shot test). Nginx 1.9.5 is the first release that implements HTTP/2.

Details of the original nginx bug: https://trac.nginx.org/nginx/ticket/428
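
To tell whether a given nginx already carries the fix, its version can be compared with 1.7.3. A small sketch, assuming GNU sort and its -V (version sort) option; version_ge is a helper name made up for this example.

```shell
# True when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V; version_ge is a hypothetical helper name.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed_in=1.7.3
for v in 1.6.2 1.7.3 1.9.10; do
    if version_ge "$v" "$fixed_in"; then
        echo "nginx $v: has the fix"
    else
        echo "nginx $v: affected"
    fi
done
```

On a live box the installed version can be read from nginx -v, which prints a line such as "nginx version: nginx/1.6.2" on stderr.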

Sep 02, 2015

pnp4nagios in Debian Jessie

EDIT 2016-03-31: an official Debian backport has existed since December 8, 2015.

After the upgrade to Jessie, pnp4nagios 0.6.16-2 is no longer in the stable archive. In fact, it stayed installed and the graphs no longer display (/pnp4nagios/graph?host=host&srv=service):

Please check the documentation for information about the following error.

Non-static method nagios_Core::SummaryLink() should not be called statically, assuming $this from incompatible context
file [line]:

application/views/graph_content.php [47]:
back

I use pnp4nagios to graph the perfdata produced by nagios. Besides being late to the party, I can't take much credit, since this has been known since at least June 29, 2014:

# sed --in-place=.orig "s|error_reporting(E_ALL \& ~E_STRICT);$|error_reporting(E_ALL \& ~E_STRICT \& ~E_DEPRECATED); // see https://bugs.debian.org/752088|" /usr/share/pnp4nagios/html/index.php

As there are quite a few changes between version 0.6.16-2 and the one in testing, I made a backport. Since the package is not in the archive, the source package has to be fetched manually (and apt-get build-dep cannot read a dsc yet):

# dget --allow-unauthenticated http://http.debian.net/debian/pool/main/p/pnp4nagios/pnp4nagios_0.6.24+dfsg1-4.dsc
# apt-get install dh-autoreconf quilt rrdtool librrds-perl python-jsmin
# cd pnp4nagios*
# dch --bpo
# dpkg-buildpackage

I took the opportunity to include the patch for the PHP error level problem and pushed it here.

Aug 18, 2015

Nagios nrpe in Debian Jessie

Yeah, I'm lagging a bit behind others (in fact I only saw this article afterwards). Since Debian Jessie, it is no longer possible to use the dont_blame_nrpe configuration option "out of the box" in /etc/nagios/nrpe.cfg: it has been disabled at compile time.

Depending on the strategy, we have two options for the version number of this new package:

  • 2.15-2: will be higher than 2.15-1 and than any security update, so the local package will be kept (see here for the reference; if there is a security update, the package will most likely be numbered 2.15-1+deb8u1)
  • 2.15-1local1: will be higher than 2.15-1 but lower than any future security update, so a security update will replace the local package (dpkg --compare-versions can help verify my claims).
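
The ordering of these version strings can be checked with dpkg itself. A short sketch using dpkg --compare-versions, which exits 0 when the stated relation holds; the echoed messages are only there for readability.

```shell
# Check the version ordering with dpkg's own comparison logic.
# Each call exits 0 when the relation holds, so the echo only runs if it is true.
dpkg --compare-versions 2.15-2 gt 2.15-1+deb8u1 \
    && echo "2.15-2 outranks a 2.15-1+deb8u1 security update"
dpkg --compare-versions 2.15-1local1 gt 2.15-1 \
    && echo "2.15-1local1 outranks 2.15-1"
dpkg --compare-versions 2.15-1local1 lt 2.15-1+deb8u1 \
    && echo "2.15-1local1 is outranked by a 2.15-1+deb8u1 security update"
```

The third comparison holds because, in Debian version ordering, letters sort before non-letter characters such as "+".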

For this case of a monitoring agent (to each their own constraints), I prefer that the package not be kept when an update arrives: this will force me to take into account and study any changes in the package, and to rebuild it if needed.

Now for rebuilding the package:

  • We pick a machine to do the compilation (I generally prefer to use debootstrap to create a chroot dedicated to this). Once the package is built, it can simply be installed on other Debian Jessie machines (assuming the same architecture) via dpkg --install or via a local repository with apt
  • We install build-essential (and devscripts to get dch), the nagios-nrpe source package and all its build dependencies with apt-get build-dep, which reads them from the source package and sorts things out by itself
  • We add the famous --enable-command-args option to debian/rules, if not already present
  • We add an entry to debian/changelog to bump the version and easily identify our changes
  • We build!
sudo apt-get install build-essential devscripts lsb-release

# make sure there is a deb-src source in sources.list(.d), otherwise add one
# (sudo tee, because the redirection itself needs root):
cat << EOF | sudo tee /etc/apt/sources.list.d/sources.list > /dev/null
deb-src http://ftp.fr.debian.org/debian $(lsb_release --codename --short) main
EOF

sudo apt-get update
sudo apt-get source nagios-nrpe-server
sudo apt-get --yes build-dep nagios-nrpe-server && cd nagios-nrpe-*/

# one-liner to add the build option only if it is not already present
grep --quiet "\-\-enable-command-args" debian/rules || \
sed --in-place "/--enable-ssl/a \\\t\t--enable-command-args \\\\" debian/rules

dch --local local "Rebuild with --enable-command-args (see #773840, #756479)"
sudo dpkg-buildpackage

The resulting package for the nrpe agent will be nagios-nrpe-server_2.15-1local1_amd64.deb. I built mine here.

Jul 18, 2015

Remote IPv4 address migration

I have to switch from one Debian GNU/Linux virtual machine (provided by a friend ;)) to another, but must keep the same public IPv4 address (and stay in the same broadcast domain). The new VM has a temporary public IPv4 address to prepare the switch.

When ready, I had some options to use the old IPv4 address on the new VM:

  • Phone my friend to coordinate the IPv4 migration
  • Power off the old VM and use the ip binary to switch from the temp address to the old one

As this VM is part of my sandbox machines (with real hosted services, but no impact or customers depending on them), I've chosen the latter option :)

As a system administrator, I never do this kind of thing @work, and I never try to modify a machine's single network interface without KVM, serial over LAN or some other out-of-band management giving direct tty access. Before starting, it's time to test your login access (yes, I mean the random chars stored in a password database that you never use because you use an ssh keypair for your daily tasks).

What I installed before starting:

  • at (yes, you know, the old and venerable)
  • iputils-arping, to have arping and especially its -U option to send a gratuitous ARP. This will update the neighbours' ARP caches (poke the gateway), because the MAC address will change for an already seen IPv4 address.

Now, it's 00:23, time to do a poor man's backup of all files on the old machine and transfer it to the new one:

old# tar --exclude='/proc' --exclude='/var/cache/rsnapshot' --exclude='/sys' --exclude='/dev' --exclude='/var/cache/apt/archives' --exclude=/root/saturnaab.tar.gz --exclude='/srv/backup/kimsuflol.iroqwa.org' --exclude='/var/lib/puppet/reports' -zcvf saturnaab.tar.gz /
new# scp 192.168.0.1:saturnaab.tar.gz .

Then, I can power off the old box, bye bye... but wait, when were you born? Let's look at the date of an installed file that I've never edited:

old# stat /etc/nanorc | grep ^Modify
Modify: 2010-04-15 19:39:40.000000000 +0200

Enough. poweroff, I say.

old# poweroff

Until this year I used halt to shut down a system; since systemd became the default init system on Debian, this no longer works (maybe halt only worked by accident?).

Now that the old box is no longer reachable:

new# ping -c 1 192.168.122.115 -W 2
PING 192.168.122.115 (192.168.122.115) 56(84) bytes of data.

--- 192.168.122.115 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

I can start to break the new box. Hmm, before this, maybe it's good to have a watchdog to automa[tg]ically restart the eth0 interface by reading the still untouched /etc/network/interfaces (remember to remove this if all works, remember to remove this...):

new# echo 'ifdown --force eth0; ifup eth0' | at now + 15 minutes

Start the IP switch:

new# ip addr add 192.168.122.115/32 dev eth0

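Note the /32 CIDR mask: the address is then not a secondary of the existing /27, so the kernel leaves it untouched when the temp address is removed.

Try to ping the old address, now on the new machine: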

desktop$ ping -c 1 192.168.122.115 -W 2
PING 192.168.122.115 (192.168.122.115) 56(84) bytes of data.

--- 192.168.122.115 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Crap, the old IP isn't reachable... I need to send a gratuitous ARP packet to update the ARP table on the gateway (to which I don't have administrative access):

new# arping -s 192.168.122.115 -U 192.168.122.115
ARPING 192.168.122.115 from 192.168.122.115 eth0
^CSent 6 probes (6 broadcast(s))
Received 0 response(s)

I forced the old address as the source (-s) and asked via ARP what my own MAC address is (O_o)

new# ip addr del 192.168.122.116/27 dev eth0

At this point, the first shell is dead because it was connected through 192.168.122.116.

And now I can assign the old address with the correct mask and remove the temporary /32 one:

new# ip addr add 192.168.122.115/27 dev eth0
new# ip addr del 192.168.122.115/32 dev eth0

And voilà! Oh no, the at watchdog command will revert all these efforts, let's remove it:

new# atq
1   Wed Jul 22 01:04:00 2015 a root
new# atrm 1
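
The whole sequence can be recapped as a dry run that only prints the commands in the order used above and executes nothing. This is just a sketch: plan_migration is a made-up name, its defaults are the addresses from this post, and the -c 3 on arping is my addition so the command ends by itself instead of waiting for ^C.

```shell
# Dry run only: print the migration commands in order, execute nothing.
# plan_migration is a hypothetical helper; defaults are the values from this post.
# Arguments: old address, temp address, prefix length, interface.
plan_migration() {
    old=${1:-192.168.122.115}
    temp=${2:-192.168.122.116}
    prefix=${3:-27}
    dev=${4:-eth0}
    cat <<EOF
echo 'ifdown --force $dev; ifup $dev' | at now + 15 minutes
ip addr add $old/32 dev $dev
arping -c 3 -s $old -U $old
ip addr del $temp/$prefix dev $dev
ip addr add $old/$prefix dev $dev
ip addr del $old/32 dev $dev
atq    # then atrm <job number> to cancel the watchdog
EOF
}

plan_migration
```

Reading the printed plan once before typing anything is cheap insurance when the only way back in is the at watchdog.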

To finish, I just need to edit the configuration in /etc/network/interfaces to use the old address (and remove all occurrences of the temp one). Next, a reboot validates the file and confirms that the configuration is correct: if I have to restart the box later, I won't have to suspect it of preventing a clean boot...

Source: http://madduck.net/blog/2006.10.20:freeing-the-primary-ip-address/
