The diary of a journey that started with debian.org
Check for duplicates
How do you find files that have different names but exactly the same content?
You could install the excellent fdupes, or you could just reinvent the wheel with bash, md5sum and awk:
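find path/ -type f -print0 | xargs -0 md5sum | awk '{
    # md5sum prints a 32-character hash and a 2-character separator before the name
    name = substr($0, 35)
    sub("[^/]*/", "", name)    # drop the leading path component
    if ($1 in cache)
        print "Found: " cache[$1], name
    else
        cache[$1] = name
}'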
path is where you want to search for duplicates. You can limit the search depth with the -maxdepth option of find.
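As a sketch of the -maxdepth idea, the pipeline can be wrapped in a small function that takes the directory and an optional depth. The name find_dupes is made up for this example, and the -print0, -0 and -r options assume GNU findutils.

```shell
#!/bin/bash
# find_dupes DIR [MAXDEPTH]: hypothetical helper, not part of any package.
# Prints one "Found:" line for every file whose content was already seen.
find_dupes() {
    local dir=$1 depth=${2:-}
    local -a limit=()
    # Only pass -maxdepth to find when a depth was requested.
    if [ -n "$depth" ]; then
        limit=(-maxdepth "$depth")
    fi
    find "$dir" "${limit[@]}" -type f -print0 |
        xargs -0 -r md5sum |
        awk '{
            name = substr($0, 35)    # skip the 32-character hash and the separator
            if ($1 in cache)
                print "Found: " cache[$1], name
            else
                cache[$1] = name
        }'
}
```

For instance, find_dupes ~/Documents 2 would look only at the directory itself and its immediate subdirectories.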
Reverse dependencies of a package
In Debian every package depends on others, so nearly every package has at least one other package that depends on it. Every once in a while you may need to know why a given package is present on your Debian machine. Here is how:
Method 1: apt-cache
$ apt-cache rdepends package
Shows all the packages that depend on package, whether they are installed or not.
Method 2: aptitude
If you, like me, don’t use aptitude very often (i.e. never) you should first update its package db:
# aptitude update
Then:
$ aptitude search '~i~Dpackage'
This command shows all the installed packages which depend on package.
Method 3: hand-made bash script
#!/bin/bash
# usage: ./irdeps.sh package [package ...]
# show the installed packages which depend on each given package
if [ $# -lt 1 ]; then
    echo "Usage: $0 package [package ...]" >&2
    exit 1
fi
while [ -n "$1" ]; do
    echo "reverse dependencies for $1..."
    while read -r pack; do
        pack=${pack/|/}    # alternative dependencies are listed with a leading "|"
        if grep -qxF "Package: $pack" /var/lib/dpkg/status; then
            awk -v pack="$pack" '
            /^Package: / && $2 == pack && flag == 0 {
                flag = 1; next
            }
            flag == 1 && /^Status: / {
                # print the package only if its own status is "installed",
                # then stop so that later Status lines are not examined
                if ($4 == "installed")
                    print pack
                exit
            }' /var/lib/dpkg/status
        fi
    done < <(apt-cache rdepends "$1" | grep '^[[:space:]]')
    shift
done
This script does the same as aptitude in about the same time, but it relies only on apt-cache and the dpkg status file (and bash + awk, of course).
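The script runs grep and awk once per candidate package. A possible variation, sketched below, loads the status file a single time and then filters the candidate names read from standard input. The function name installed_from_status is invented for this example, and it assumes the usual "Package:" and "Status:" fields of /var/lib/dpkg/status.

```shell
#!/bin/bash
# installed_from_status STATUS_FILE: hypothetical helper.
# Reads candidate package names on stdin (as printed by "apt-cache rdepends",
# possibly indented or prefixed with "|") and prints, once each, those that
# STATUS_FILE records as installed.
installed_from_status() {
    awk '
        # First input: the dpkg status database.
        NR == FNR {
            if ($1 == "Package:")
                pkg = $2
            else if ($1 == "Status:" && $4 == "installed")
                installed[pkg] = 1
            next
        }
        # Second input (stdin): the candidate names.
        {
            name = $1
            sub(/^\|/, "", name)
            if ((name in installed) && !(name in seen)) {
                print name
                seen[name] = 1
            }
        }
    ' "$1" -
}
```

On a Debian machine it could be used as: apt-cache rdepends bash | grep '^[[:space:]]' | installed_from_status /var/lib/dpkg/status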
References: algebraicthunk.net/~dburrows.
MySQL and regular expressions
The typical SQL statement is something like this:
SELECT name
FROM accounts
WHERE surname='Steele'
However, in some cases you might need to find all the tuples in which a field matches a particular pattern. How can you do that? It’s simple: with regular expressions.
SELECT name
FROM accounts
WHERE surname REGEXP pattern
Examples:
Get all tuples in which the surname begins with “Sm”:
SELECT name
FROM accounts
WHERE surname REGEXP '^Sm'
Get all tuples in which the surname ends with “ith”:
SELECT name
FROM accounts
WHERE surname REGEXP 'ith$'
Get all tuples in which the surname contains “all” or “All”:
SELECT name
FROM accounts
WHERE surname REGEXP '[Aa]ll'
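And so on. Keep in mind that REGEXP matching on non-binary strings is normally case-insensitive in MySQL, so the [Aa] class matters only when the column or its collation is case-sensitive.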
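Before running a pattern against the database, it can be handy to try it on a few sample values from the shell. The sketch below feeds the three patterns from the examples to grep -E. The surnames are invented test data, and grep is always case-sensitive here, which may differ from what MySQL does with your collation.

```shell
#!/bin/bash
# match_surnames PATTERN: print the sample surnames (made-up data) matching PATTERN.
match_surnames() {
    printf '%s\n' Smith Smythe Goldsmith Allen Marshall Steele | grep -E -e "$1"
}

match_surnames '^Sm'      # surnames beginning with "Sm"
match_surnames 'ith$'     # surnames ending with "ith"
match_surnames '[Aa]ll'   # surnames containing "all" or "All"
```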
How to grep two strings in a file
Using grep we can check if a file contains a given string:
$ cat file
foo
bar
baz
if grep -q bar file; then
echo file contains bar;
fi
If we need to check whether file contains string1 and string2 on the same line and in a given order, we can still use grep:
if grep -q 'string1.*string2' file; then
    echo "file contains string1 and string2 (the former preceding the latter)"
fi
However, if the order doesn’t matter, we can use awk:
if awk '/string1/ && /string2/ {found=1; exit} END{exit !found}' file; then
    echo "file contains string1 and string2 in the same line"
fi
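For the order-independent, same-line case, a grep-only alternative is to chain two greps: the first keeps the lines containing one string and the second looks for the other string among them. A minimal sketch follows; same_line_both is a name chosen just for this example.

```shell
#!/bin/bash
# same_line_both FILE S1 S2: hypothetical helper.
# Succeeds if at least one line of FILE matches both patterns, in any order.
same_line_both() {
    # The exit status of the pipeline is the status of the last grep.
    grep -e "$2" -- "$1" | grep -q -e "$3"
}
```

It can be used in the same way as the other tests: if same_line_both file string1 string2; then ...; fi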
Now, if we want to check if file contains string1 and string2 even if not in the same line, we can use two greps:
if grep -q string1 file && grep -q string2 file; then
echo file contains string1 and string2;
fi
but in doing so we read the same file twice; that sounds bad, especially if the file is not very small.
In order to read the file only once, we can use awk again, keeping one flag per string and stopping as soon as both have been seen:
if awk '/string1/{a=1} /string2/{b=1} a&&b{exit} END{exit !(a&&b)}' file; then
echo file contains string1 and string2;
fi
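The one-pass check can be turned into a reusable function by passing the two patterns to awk with -v. This is only a sketch: has_both is an invented name, and the patterns are treated as awk regular expressions (note that awk also interprets backslash escapes in -v values).

```shell
#!/bin/bash
# has_both FILE RE1 RE2: hypothetical helper.
# Succeeds if FILE has a line matching RE1 and a line (the same one or
# another) matching RE2. The file is read once and reading stops early.
has_both() {
    awk -v re1="$2" -v re2="$3" '
        $0 ~ re1 { a = 1 }
        $0 ~ re2 { b = 1 }
        a && b   { exit }             # both seen: no need to read further
        END      { exit !(a && b) }   # status 0 only if both were seen
    ' "$1"
}
```

Two lines that both contain only the first string do not count as a match, because each pattern has its own flag.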
If we want to do the same for every file in a given directory, we can simply put the test in a for loop:
for file in "$dir"/*; do
    if awk '/string1/{a=1} /string2/{b=1} a&&b{exit} END{exit !(a&&b)}' "$file"; then
        echo "$file contains string1 and string2"
    fi
done
or we can “slightly” change the awk script:
awk 'FNR==1{if (fn && a && b) print fn " contains string1 and string2"; fn=FILENAME; a=b=0}
/string1/{a=1}
/string2/{b=1}
END{if (a && b) print fn " contains string1 and string2"}' "$dir"/*
Last case: what happens if we need to check recursively (i.e. for every file in every subdirectory)? We can use find, which can run awk directly:
find "$dir" -type f -exec awk 'FNR==1{if (fn && a && b) print fn " contains string1 and string2"; fn=FILENAME; a=b=0}
/string1/{a=1} /string2/{b=1}
END{if (a && b) print fn " contains string1 and string2"}' {} +
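If GNU grep is available, the recursive case can also be sketched with two greps: the first lists the files containing one string and the second keeps those that also contain the other. Only the files that pass the first test are read a second time. The function name files_with_both is made up, the strings are treated as fixed text (-F), and -Z and -r are GNU extensions.

```shell
#!/bin/bash
# files_with_both DIR S1 S2: hypothetical helper, assumes GNU grep and xargs.
# Prints the files under DIR that contain both fixed strings.
files_with_both() {
    # -r: recurse, -l: print file names only, -Z: separate names with NUL
    grep -rlZF -e "$2" -- "$1" | xargs -0 -r grep -lF -e "$3" --
}
```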
Lock the screen and save the environment
If you lock the screen with Ctrl+Alt+L (in GNOME), a screensaver starts and you’ll be asked for a password the next time you press a key. That’s pretty nice if you are taking a break from a public PC, but from the environmental point of view it doesn’t change anything: your machine consumes the same power whether you are sitting in front of it or not.
You can change that by using a command to turn your monitor off:
$ sleep 10 && xset dpms force off
You then have 10 seconds to lock the screen in the usual way.
A more friendly way to accomplish the same task is to use a keybinding and a little script:
#!/bin/bash
gnome-screensaver-command -l
xset dpms force off
Now just run gconf-editor and edit /apps/metacity/keybinding_commands/command_1 and /apps/metacity/global_keybindings/run_command_1 or, if you are using compiz, run ccsm and edit the corresponding values in General Options. No, you can’t use Ctrl+Alt+L, but you may choose Ctrl+Alt+K (which I happily use).
The above script is very simple but it’s far from being perfect. Here is something more complex:
#!/bin/bash
# lock the screen and start the screensaver
gnome-screensaver-command -l
# store the string returned by "gnome-screensaver-command -q"
# We have to do this because the string is localized,
# so its exact text is not predictable :S
activestring=$(gnome-screensaver-command -q)
if [ -z "$activestring" ]; then
    echo "Error: null active string!" >&2
    exit 1
fi
sleep 1
# While the screensaver is active and the user is not entering the password
# to unlock the screen, we force the monitor to stay turned off.
# If the password dialog is open, the user has 30 seconds to type in the
# password. After this period the dialog is killed and we return to the
# screensaver + monitor switched off situation.
while [ "$(gnome-screensaver-command -q)" = "$activestring" ]; do
    if ps aux | grep -q 'gnome-screensaver-[d]ialog'; then
        sleep 30
        if [ "$(gnome-screensaver-command -q)" = "$activestring" ] && ps aux | grep -q 'gnome-screensaver-[d]ialog'; then
            killall gnome-screensaver-dialog
        else
            exit
        fi
    fi
    xset dpms force off
    sleep 10
done
The above script comes with two major features:
- the monitor is turned off every 10 seconds for as long as the screen is locked (sometimes it just turns itself on :O )
- to unlock the screen, the user has 30 seconds to enter a valid password. After that, the monitor is turned off again and the password prompt is killed
On my home PC, the above script allows me to save about 30W (~30% of the total power consumption) while I am away. Well, hibernation (suspend to disk) would be a lot better but often you simply can’t do that.
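To get a feeling for what 30W means over time, here is a back-of-the-envelope calculation. The 8 hours per day and 365 days are assumptions chosen for the example, not measurements, and saved_kwh is a name invented here.

```shell
#!/bin/bash
# saved_kwh WATTS HOURS_PER_DAY DAYS: hypothetical helper.
# Prints the energy saved in kWh, rounded to two decimals.
saved_kwh() {
    awk -v w="$1" -v h="$2" -v d="$3" 'BEGIN { printf "%.2f\n", w * h * d / 1000 }'
}

# 30 W saved for an assumed 8 hours a day over a year
saved_kwh 30 8 365
```

With these assumed figures the saving is 30 x 8 x 365 / 1000 = 87.6 kWh per year.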
Know your system
What are the executables you use every day made of? Are they scripts or binaries?
What do developers use? Python? Java? Or perhaps Ruby?
It’s easy and fun to find out; you just need a couple of bash lines with a pinch of awk:
#!/bin/bash
datafile=$(mktemp /tmp/bintypes.logXXXXXX)
# scan /bin, /sbin, /usr/bin and /usr/sbin
for file in /{,s}bin/* /usr/{,s}bin/*; do
    rfile=$(readlink -f "$file")
    # file is a tool which tells what a file contains
    [ -f "$rfile" ] && file "$rfile" >>"$datafile"
done
# count all occurrences with awk
# in some cases we have an 'a ' article in front of the description (e.g. "a python script text executable")
# we cut it away with sub()
awk -F "[:,]" '
{
    sub(/^ a /, " ", $2)
    a[$2]++
}
END {
    for (el in a)
        print a[el] "\t" el
}' "$datafile" | sort -n
rm "$datafile"
Here is my output:
1 ASCII English text
1 awk script text executable
1 /bin/loadkeys script text executable
2 setgid ELF 32-bit LSB shared object
4 /usr/bin/ruby1.8 script text executable
5 setuid setgid ELF 32-bit LSB executable
13 ELF 32-bit LSB shared object
14 setgid ELF 32-bit LSB executable
35 setuid ELF 32-bit LSB executable
61 Bourne-Again shell script text executable
119 python script text executable
225 perl script text executable
377 POSIX shell script text executable
1748 ELF 32-bit LSB executable
Hence, apart from the binaries, the most common language is POSIX shell scripting, followed by Perl and then Python. That lonely awk script is /usr/sbin/mksmbpasswd, while the plain text file is just a misdetection: it’s a shell script without the shebang (to whom it may concern, the file is /usr/bin/gnome-power-bugreport.sh). Do these results surprise you?
It must be said that of all the above tools only 96 belong to GNU coreutils; the others come from a really huge variety of programs (e.g. TeX, java, console.tools, etc.). To find out which file belongs to which package, modify the above script to use “dpkg -S $rfile” instead of file and change the field being counted by awk.
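One way that modification might look is sketched below. The counting part is a function that reads "dpkg -S" output (lines of the form "package: /path") on standard input, so it can be tried on sample lines as well. The name count_by_package is invented for this example.

```shell
#!/bin/bash
# count_by_package: hypothetical helper.
# Reads "dpkg -S" output ("package: /path") on stdin and prints
# "count<TAB>package", smallest count first.
count_by_package() {
    awk -F ': ' '
        NF >= 2 { count[$1]++ }
        END { for (pkg in count) print count[pkg] "\t" pkg }
    ' | sort -n
}

# Usage on a Debian system (left commented out here):
# for file in /{,s}bin/* /usr/{,s}bin/*; do
#     dpkg -S "$(readlink -f "$file")" 2>/dev/null
# done | count_by_package
```

Files that dpkg does not know about produce only an error message, which the usage example discards.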
That’s all.