I often need to "audit" a GNU/Linux server quickly when dealing with
a large farm of new servers. After some years, this is what I use:
w(1)
: shorter to type than uptime + who to get load average
# w
 00:05:25 up 51 days,  9:50,  2 users,  load average: 0.01, 0.06, 0.05
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    localhost        22:03    2.00s  3:05   0.01s w
pstree(1)
: shorter to type than ps with many options, output grouped
# pstree
systemd─┬─acpid
        ├─2*[agetty]
        ├─apache2───7*[apache2]
        ├─atd
        ├─collectdmon───collectd───10*[{collectd}]
        ├─cron
        ├─dbus-daemon
        ├─dmeventd
        ├─haveged
        ├─lvmetad
        ├─master─┬─pickup
        │        └─qmgr
        ├─memcached───5*[{memcached}]
        ├─mysqld_safe───mysqld───24*[{mysqld}]
        ├─nginx───2*[nginx]
        ├─nrpe
        ├─ntpd
        ├─puppet───{ruby-timer-thr}
        ├─rsyslogd─┬─{in:imklog}
        │          ├─{in:imuxsock}
        │          └─{rs:main Q:Reg}
        ├─sshd─┬─sshd───bash───pstree
        │      └─sshd───sshd───bash───vi
        ├─2*[systemd───(sd-pam)]
        ├─systemd-journal
        ├─systemd-logind
        └─systemd-udevd
lsblk(8)
: list the block device 'topology' (from util-linux)
# lsblk
NAME                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda                   254:0    0   40G  0 disk
├─vda1                254:1    0  243M  0 part /boot
└─vda2                254:2    0 39.8G  0 part
  ├─debian-root       253:0    0 38.9G  0 lvm  /
  └─debian-swap_1     253:1    0  872M  0 lvm  [SWAP]
lsb_release(1)
: show full operating system information (may not be installed everywhere, but will be if a configuration management system such as puppet, or an inventory system, is installed)
# lsb_release --all
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 8.5 (jessie)
Release: 8.5
Codename: jessie
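The four commands above can be chained into one first-look report. This is only a minimal sketch: `quick_audit` is a hypothetical helper name, not something from a package, and it assumes a POSIX shell. Tools that are not installed are skipped rather than treated as errors.

```shell
#!/bin/sh
# quick_audit: hypothetical helper that runs the first-look commands in order.
quick_audit() {
    for cmd in "w" "pstree" "lsblk" "lsb_release --all"; do
        # The first word is the binary name; skip the tool when it is missing.
        bin=${cmd%% *}
        if command -v "$bin" >/dev/null 2>&1; then
            printf '### %s\n' "$cmd"
            # Unquoted on purpose so that "lsb_release --all" splits into words.
            $cmd 2>&1 || true
        else
            printf '### %s: not installed, skipped\n' "$bin"
        fi
    done
}
```

Calling `quick_audit | less` right after logging in gives the load, the process tree, the disk layout and the distribution release in a single pass.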
Some other commands I also use:
whatis(1)
: display on-line manual page description of a binary
pgrep(1)
: shorter to type than its friend ps $your_favourite_options | grep $pattern
# pgrep $pattern
4304
4711
Add --list-name (-l) to list the process name, or --list-full (-a) to list the full command line:
# pgrep -a mysql
4304 /bin/sh /usr/bin/mysqld_safe
4711 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
namei(1)
: follow a pathname and print the attributes of every traversed file, useful to verify permissions for an unprivileged user
# namei --long /etc/hosts
f: /etc/hosts
drwxr-xr-x root root /
drwxr-xr-x root root etc
-rw-r--r-- root root hosts
tailf(1)
: shorter to type than tail -f, and doesn't update the access time when not needed
httpie(1)
: HTTP client for humans, with colored output; can replace curl -I when debugging web servers/caches
tshark(1)
: 'tiny wireshark', easier for humans to read than the venerable tcpdump, with the same syntax for BPF filters, and it dissects protocols. Suppose you want to inspect HTTP transactions with minimal details:
tcpdump -nn -i lo port 80 -A
23:43:33.322766 IP 127.0.0.1.55345 > 127.0.0.1.80: Flags [P.], seq 3338438874:3338438967, ack 2071539399, win 1012, options [nop,nop,TS val 1110051799 ecr 1110049301], length 93
E.....@.@..o.........1.P....{y.............
B*..B*..GET /nginx_status?auto HTTP/1.1
User-Agent: collectd/5.4.1
Host: localhost
Accept: */*
23:43:33.329166 IP 127.0.0.1.80 > 127.0.0.1.55345: Flags [P.], seq 1:249, ack 93, win 342, options [nop,nop,TS val 1110051801 ecr 1110051799], length 248
E.., .@.@............P.1{y.....7...V. .....
B*..B*..HTTP/1.1 200 OK
Date: Tue, 07 Jun 2016 21:43:33 GMT
Content-Type: text/plain
Content-Length: 107
Connection: keep-alive
Server: ohmy
Active connections: 4
server accepts handled requests
3139 3139 19247
Reading: 0 Writing: 3 Waiting: 2
tshark -nn -i lo -f 'port 80'
1 0.000000 127.0.0.1 -> 127.0.0.1 HTTP 159 GET /nginx_status?auto HTTP/1.1
2 0.000132 127.0.0.1 -> 127.0.0.1 HTTP 314 HTTP/1.1 200 OK (text/plain)
Fewer details, but more readable at first glance.
ngrep(8)
: network grep is easier for humans than the venerable tcpdump when it comes to pretty-printing packet contents:
ngrep -q -W byline 'port 80'
T 127.0.0.1:55345 -> 127.0.0.1:80 [AP]
GET /nginx_status?auto HTTP/1.1.
User-Agent: collectd/5.4.1.
Host: localhost.
Accept: */*.
.
T 127.0.0.1:80 -> 127.0.0.1:55345 [AP]
HTTP/1.1 200 OK.
Date: Tue, 07 Jun 2016 21:38:23 GMT.
Content-Type: text/plain.
Content-Length: 107.
Connection: keep-alive.
Server: ohmy.
.
Active connections: 4
server accepts handled requests
3125 3125 19165
Reading: 0 Writing: 3 Waiting: 2
You may have noticed that HTTP/2 is a binary format, so ngrep will no longer display the content (but tshark will, thanks to its dissector).
multitail(1)
: tail, but opens multiple files and uses colors to tell them apart; pressing Enter adds a red mark so you can see where future updates to the file begin (bye bye pressing Enter multiple times!)
jq(1)
: pretty-prints JSON. It can replace python -m json.tool because jq has... colors! It also has powerful filters to manipulate values
watch(1)
: executes a command periodically; bye bye up arrow + Enter to replay a command multiple times!
netcat(1)
: the venerable telnet is great, but its escape sequence may be hard to type on some terminals, which is why I prefer netcat; you can also use its options to test layer 4 firewalls in scripts:
$ nc -w 3 -v -z www.iroqwa.org 80; echo $?
Connection to www.iroqwa.org 80 port [tcp/http] succeeded!
0
$ nc -w 3 -v -z www.iroqwa.org 8081; echo $?
nc: connect to www.iroqwa.org port 8081 (tcp) failed: Connection refused
nc: connect to www.iroqwa.org port 8081 (tcp) timed out: Operation now in progress
1
Option -w sets a timeout and -z makes netcat connect without sending any data. Without -z, the netcat escape sequence is ^C. This works with both nc.traditional and nc.openbsd.
view(1)
: opens a file with vim(1) in read-only mode; this prevents vi from buffering the file and updating its last access time. The same can be achieved by pressing v when viewing the file with less(1).
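The netcat firewall test shown earlier is easy to wrap for scripts. The following is a minimal sketch, assuming an `nc` that accepts `-w` and `-z` as in the transcript above; `port_open` and `check_ports` are hypothetical helper names, and the host:port targets are placeholders to adapt.

```shell
#!/bin/sh
# port_open HOST PORT [TIMEOUT]: hypothetical helper, succeeds when a TCP connect works.
port_open() {
    # -z: connect without sending data; -w: timeout in seconds (default 3)
    nc -w "${3:-3}" -z "$1" "$2" >/dev/null 2>&1
}

# check_ports HOST:PORT...: print one status line per target.
check_ports() {
    for target in "$@"; do
        host=${target%:*}     # everything before the last colon
        port=${target##*:}    # everything after the last colon
        if port_open "$host" "$port"; then
            echo "$target open"
        else
            # A refused connection and a timeout both end up here.
            echo "$target closed-or-filtered"
        fi
    done
}
```

For example, `check_ports www.iroqwa.org:80 www.iroqwa.org:8081` prints one line per target, relying only on the exit status of nc.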
As you can see, I generally adopt tools that offer simple, and even colorized, output for easier viewing.
Life is not only black and white on terminals!
I've enabled SPDY on nginx 1.6.2, shipped with Debian GNU/Linux 8 Jessie.
In Chrome's DevTools Console I saw:
GET https://mysite/myasset.ico net::ERR_CONNECTION_CLOSED
when fetching my static assets. I started digging for a similar issue and
found that the problem is not specific to Chrome: it is reproducible with Firefox too.
In a Chrome bug report, I found that the combination of nginx + ssl + spdy + proxy_cache was the issue.
The problem is that the SPDY connection is closed prematurely if proxy caching is used.
As SPDY multiplexes requests over a single connection, if that connection is closed, all other in-flight transactions are lost.
This bug is fixed in nginx 1.7.3 and later, so I will choose one of the following options:
- Upgrade nginx using the Debian backport (>= 1.9.10)
- Disable SPDY, as the protocol is superseded by HTTP/2 and I don't need it for now (this was just a one-shot test). Nginx 1.9.5 is the first release that implements HTTP/2.
Details of the original nginx bug: https://trac.nginx.org/nginx/ticket/428
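To tell whether a given server still runs an affected nginx, its version can be compared with 1.7.3. This is a minimal sketch in POSIX shell: `version_ge`, `nginx_version` and `needs_spdy_workaround` are hypothetical helper names, and the check assumes the upstream version number is a good enough indicator (a distribution package may carry backported fixes that the number does not show).

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A >= B (numeric fields only).
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}          # leading field of each version
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Equal so far: drop the leading field and continue.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# nginx_version: print the version from "nginx -v" (which writes to stderr).
nginx_version() {
    nginx -v 2>&1 | sed -n 's|^nginx version: nginx/\([0-9.]*\).*|\1|p'
}

# needs_spdy_workaround: true when the installed nginx predates 1.7.3.
needs_spdy_workaround() {
    v=$(nginx_version)
    [ -n "$v" ] && ! version_ge "$v" 1.7.3
}
```

A possible use is `needs_spdy_workaround && echo "disable spdy or upgrade nginx"` in an inventory script.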
EDIT 2016-03-31: an official Debian backport has existed since 8 December 2015.
After the upgrade to Jessie, pnp4nagios 0.6.16-2 is no longer in the stable archive. It actually stayed installed, and the graphs are no longer displayed (/pnp4nagios/graph?host=host&srv=service):
Please check the documentation for information about the following error.
Non-static method nagios_Core::SummaryLink() should not be called statically, assuming $this from incompatible context
file [line]:
application/views/graph_content.php [47]:
back
I use pnp4nagios to graph the perfdata produced by nagios. Besides lagging behind, I can't really take credit, since this has been known since at least 29 June 2014:
# sed --in-place=.orig "s|error_reporting(E_ALL \& ~E_STRICT);$|error_reporting(E_ALL \& ~E_STRICT \& ~E_DEPRECATED); // see https://bugs.debian.org/752088|" /usr/share/pnp4nagios/html/index.php
As there are quite a few changes between version 0.6.16-2 and the one in testing, I made a backport. Since the package is not in the archive, the source package has to be fetched manually (and apt-get build-dep cannot read a dsc yet):
# dget --allow-unauthenticated http://http.debian.net/debian/pool/main/p/pnp4nagios/pnp4nagios_0.6.24+dfsg1-4.dsc
# apt-get install dh-autoreconf quilt rrdtool librrds-perl python-jsmin
# cd pnp4nagios*
# dch --bpo
# dpkg-buildpackage
I took the opportunity to include the patch for the PHP error level problem and pushed it here.
Yeah, I'm lagging a bit behind others (in fact I saw this article afterwards).
Since Debian Jessie, it is no longer possible to use the dont_blame_nrpe configuration option out of the box in /etc/nagios/nrpe.cfg: it has been disabled at compile time.
Depending on the strategy, there are two options for numbering the version of this new package:
- 2.15-2: this will be greater than 2.15-1 and than any security updates (see here for the reference; if there is a security update, chances are the package will be numbered 2.15-1+deb8u1)
- 2.15-1local1: this will be greater than 2.15-1 but lower than new security updates, so the local package is only kept until such an update arrives (dpkg --compare-versions can help to check my claims).
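The ordering claimed in the two options above can be checked directly with dpkg --compare-versions. A minimal sketch: `deb_cmp` is a hypothetical helper name, and 2.15-1+deb8u1 is only the likely numbering of a future security update, as assumed in the text.

```shell
#!/bin/sh
# deb_cmp A OP B: print whether "A OP B" holds according to dpkg.
# deb_cmp is a hypothetical helper; OP is one of lt, le, eq, ge, gt.
deb_cmp() {
    if dpkg --compare-versions "$1" "$2" "$3"; then
        echo "$1 $2 $3: true"
    else
        echo "$1 $2 $3: false"
    fi
}

# 2.15-1+deb8u1 is an assumed version number for a security update.
deb_cmp 2.15-2       gt 2.15-1+deb8u1   # option 1 would outrank the update
deb_cmp 2.15-1local1 gt 2.15-1          # option 2 outranks the stock package
deb_cmp 2.15-1local1 lt 2.15-1+deb8u1   # but is outranked by the update
```

On a Debian system, each line should report true if the statements above hold.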
For this case of a monitoring agent (to each their own constraints), I prefer that the package not be kept when an update arrives: this will force me to take into account and study any changes in the package, and to rebuild it if needed.
Now, rebuilding the package:
- Pick a machine that will do the compilation (I generally prefer to use debootstrap to create a dedicated chroot). Once the package is built, it can simply be installed on other Debian Jessie machines (assuming the same architecture) via dpkg --install or via a local repository with apt
- Install build-essential (and devscripts to get dch), the nagios-nrpe source package, and all its build dependencies with apt-get build-dep, which reads them from the source package and takes care of the rest
- Add the famous --enable-command-args option in debian/rules, if not already present
- Add an entry in debian/changelog to bump the version and easily identify our changes
- Build!
sudo apt-get install build-essential devscripts lsb-release
# make sure a deb-src source exists in sources.list(.d); otherwise add one
# (written through sudo tee, since a plain redirection would not run as root):
sudo tee /etc/apt/sources.list.d/sources.list > /dev/null << EOF
deb-src http://ftp.fr.debian.org/debian $(lsb_release --codename --short) main
EOF
sudo apt-get update
sudo apt-get source nagios-nrpe-server
# package name spelled out instead of !$, which only expands in an interactive shell
sudo apt-get --yes build-dep nagios-nrpe-server && cd nagios-nrpe*
# one-liner that adds the build option only if it is not already present
grep --quiet -e "--enable-command-args" debian/rules || \
sed --in-place "/--enable-ssl/a \\\t\t--enable-command-args \\\\" debian/rules
dch --local local "Rebuild with --enable-command-args (see #773840, #756479)"
sudo dpkg-buildpackage
The resulting package for the nrpe agent will be nagios-nrpe-server_2.15-1local1_amd64.deb. I've built my own here.
I have to switch from one Debian GNU/Linux virtual machine (provided by a friend ;)) to another, but must keep the same public IPv4 address (and stay in the same broadcast domain). The new VM has a temporary public IPv4 address to prepare the switch.
When ready, I had two options to use the old IPv4 address on the new VM:
- Phone my friend to coordinate the IPv4 migration
- Power off the old VM and use the ip binary to switch from the temp address to the old one
As this VM is one of my sandbox machines (with real hosted services, but no impact or customers depending on them), I chose the latter option :)
As a system administrator, I never do this kind of thing @work, and I never try to modify a single network interface without having KVM, serial over LAN, or some other out-of-band management giving direct tty access. Before starting, it's time to test your login access (yes, I mean the random characters stored in a password database that you never use, because you rely on an SSH key pair for your daily tasks).
What I installed before starting:
- at (yes, you know, the old and venerable)
- iputils-arping, to get arping and especially its -U option to send a gratuitous ARP. This will update the neighbours' ARP caches (poke the gateway), because the MAC address will change for an already seen IPv4 address.
Now, it's 00:23, time to do a poor man's backup of all files on the old machine and transfer it to the new one:
old# tar --exclude='/proc' --exclude='/var/cache/rsnapshot' --exclude='/sys' --exclude='/dev' --exclude='/var/cache/apt/archives' --exclude=/root/saturnaab.tar.gz --exclude='/srv/backup/kimsuflol.iroqwa.org' --exclude='/var/lib/puppet/reports' -zcvf saturnaab.tar.gz /
new# scp 192.168.0.1:saturnaab.tar.gz .
Then I can power off the old box, bye bye... but wait, when were you born? Let's find the date of a file from the installation that I've never edited:
old# stat /etc/nanorc | grep ^Modify
Modify: 2010-04-15 19:39:40.000000000 +0200
Enough. poweroff, I say.
Until this year I used halt to shut down a system; since systemd became the default init system on Debian, this no longer works (was halt only working by accident?).
Now that the old box is no longer reachable:
new# ping -c 1 192.168.122.115 -W 2
PING 192.168.122.115 (192.168.122.115) 56(84) bytes of data.
--- 192.168.122.115 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
I can start to break the new box. Hmm, before that, maybe it's good to have a watchdog that automa[tg]ically restarts the eth0 interface by reading the not-yet-touched /etc/network/interfaces (remember to remove this if all works, remember to remove this...):
new# echo 'ifdown --force eth0; ifup eth0' | at now + 15 minutes
Start the IP switch:
new# ip addr add 192.168.122.115/32 dev eth0
Note the /32 CIDR mask: the address is then not in the same subnet as the temp one, so the kernel leaves it untouched when the temp address is removed.
Try to ping the old address on the new machine:
desktop$ ping -c 1 192.168.122.115 -W 2
PING 192.168.122.115 (192.168.122.115) 56(84) bytes of data.
--- 192.168.122.115 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Crap, the old IP isn't reachable... I need to send a gratuitous ARP packet to update the ARP table on the gateway (to which I don't have administrative access):
new# arping -s 192.168.122.115 -U 192.168.122.115
ARPING 192.168.122.115 from 192.168.122.115 eth0
^CSent 6 probes (6 broadcast(s))
Received 0 response(s)
I forced the source to be the old address (-s) and asked, via ARP, what the MAC address is for myself (O_o)
new# ip addr del 192.168.122.116/27 dev eth0
At this point, the first shell is dead because it was connected through 192.168.122.116.
Now I can assign the old address with the correct mask and remove the temporary /32 one:
new# ip addr add 192.168.122.115/27 dev eth0
new# ip addr del 192.168.122.115/32 dev eth0
And voilà! Oh no, the at watchdog command will revert all these efforts, let's remove it:
new# atq
1 Wed Jul 22 01:04:00 2015 a root
new# atrm 1
To finish, I just need to edit the configuration in /etc/network/interfaces to use the old address (and remove all occurrences of the temp one). Next, a reboot to validate the file confirms that the configuration is correct. This matters because if I have to restart the box later, I won't have to wonder whether this change is what prevents it from booting correctly...
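The address switch above can be sketched as a small function that prints the commands by default, so the sequence can be reviewed before anything touches the interface. This is an illustration under assumptions, not the exact procedure: `run` and `switch_ip` are hypothetical names, the `-c 3` and `-I` arping options are additions that the transcript above does not use, and the at watchdog is deliberately left out.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical wrapper that prints or executes its arguments.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# switch_ip IFACE OLD_IP TEMP_IP PREFIXLEN (hypothetical helper)
switch_ip() {
    iface=$1 old=$2 temp=$3 len=$4
    run ip addr add "$old/32" dev "$iface"            # old address first, as a /32
    run arping -c 3 -U -I "$iface" -s "$old" "$old"   # gratuitous ARP for the old address
    run ip addr del "$temp/$len" dev "$iface"         # drop the temp address
    run ip addr add "$old/$len" dev "$iface"          # old address with the real mask
    run ip addr del "$old/32" dev "$iface"            # remove the temporary /32
}
```

With the values from this post, `switch_ip eth0 192.168.122.115 192.168.122.116 27` prints the five commands in the order they were typed above.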
Source: http://madduck.net/blog/2006.10.20:freeing-the-primary-ip-address/