|
Kristian Nielsen
[Recent Entries][Archive][Friends][User Info]
Below are the 20 most recent journal entries recorded in the "Kristian Nielsen" journal:[<< Previous 20 entries]
07:45 pm
[Link] |
Network troubles, part 2
This is a followup
to part 1 of
the story, where I found that a hanging ftp transfer was caused by one of my
network components not being able to transmit certain bit patterns.
After getting on-site, I had the chance to move around cables to test each
component in isolation.
I was quite surprised to learn that the problem seems to be with my old
Linksys WRT54GS router. This router has one WAN Ethernet port, four LAN
Ethernet port that work as a 4-port switch, and a wireless WLAN
"port"/antenna; and the box performs routing/NAT among the three parts.
When I use my test server to send my magic bytes between two hosts both
connected to the 4-port LAN switch on the Linksys, there is no problem, nor
between two hosts both connected to the WLAN. When I send the data in through
the WLAN, and out through the LAN, there is also no problem.
But when I send the data in through the LAN and out through the WLAN, the
transfer hangs due to repeated checksum errors on the data packet. Same if I
send in through the LAN and out through the WAN. And same if I send in through
the WAN and out through either LAN or WLAN.
So, in summary, the problem happens whenever the data is routed inside the
Linksys from any Ethernet port to any other port. But only when it is
routed, not when it is merely switched among the 4 LAN ports. And only when
going into an Ethernet port.
I of course tried re-cycling the power on the Linksys, switching cables, and
switching among the LAN ports. No difference. So I am forced to put the blame
on the little blue Linksys box, which is a pity as it has served me well for
more than 3 years and I had gotten rather fond of it.
So problem solved! Replacing the Linksys, I should be up and running without
problems again. Still, I am left wondering about two questions:
- This is not the kind of problem I would expect to turn up after 3 years
of flawless service; on the other hand I also find it hard to believe that I
would have not discovered this problem for 3 years if it was there from the
start.
- I was very surprised to find this problem connected with the routing. I
would have expected some hardware problem, like a bad cable or marginal
Ethernet port signal processing. But the Ethernet connection works in switch
mode and not in router mode. This sounds like a software problem with the
driver or IP stack inside the Linksys. A bug that depend on a certain bit
string pattern I would not have thought a software problem
(Ie. a
memcpy() but that depends on the bit pattern copied
sounds quite unlikely).
Of well. It is probably some weird hardware problem in the interface between
main board and Ethernet device, and switching happens entirely inside the
device so not affected by the problem. Or something like that.
Tags: debugging, network
|
10:09 pm
[Link] |
Network troubles
As this story shows, the cause of a network problem is not always where you
first suspect...
So I just set up an ftp server on my home network for easy file transfer with
some family members. Everything was working fine, except
... occasionally, file transfers would just hang, for no apparent
reasons. Logs did not say anything.
So I of course first thought that I made some error in the setup of the ftp
server. The server is behind two NAT routers, and ftp is of course tricky with
NAT due to the use of multiple associated connections. I did try during the
setup to properly configure masquerading on the server and correct port
forwarding in both NAT boxes (ports 20 and 21 and some range
of ports for passive ftp connections), but clearly there are potential for
errors here.
So I started by double-checking port configuration. It was fine. So I go read
up on the details of the ftp protocol and the issues with NAT and firewalls
here.
But I still have no ideas. And while my initial guess was a configuration
problem on my part, I start to wonder... I see a file transfer start, and then
after some part of the file has transfered correctly, the transfer hangs. It
is hard to imagine how a port misconfiguration could cause this. Failing to
initiate the transfer yes, but hang in the middle no.
So finally I managed to find a way to reproduce the problem myself, and I
obtained two tcpdumps (each 40Mb...) at each end of the connection. And saw
something quite interesting. When the error occurs, the client receives a data
packet with TCP checksum error. The packet is re-transmitted, but each
retransmission arrives with TCP checksum errors. So this of course explains
the hang. But why the checksum error?
It turns out the problem is with the particular data! So extracting the
offending packet from the tcpdump, I now have a 1448 byte file which cannot
pass through my network uncorrupted :-(:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 7d00 0000 cfbf 0c00 e1e3 8e00 403e ..}...........@>
00000070: 7500 ad8b c500 ccaa 0000 8175 5600 d061 u..........uV..a
00000080: 3000 6ff0 2000 0565 ea00 707d 1b00 de3c 0.o. ..e..p}...<
00000090: d800 123a d800 4676 4000 8175 5600 d061 ...:..Fv@..uV..a
000000a0: 3000 6ff0 2000 0565 ea00 707d 1b00 de3c 0.o. ..e..p}...<
000000b0: d800 123a d800 4676 4000 a3db 0000 fe60 ...:..Fv@......`
000000c0: 0000 a3db 0000 fe60 0000 4000 0000 7eaf .......`..@...~.
000000d0: 0000 4000 0000 7eaf 0000 7f1b 7f1b 7f1b ..@...~.........
000000e0: 7f1b 0000 0000 0000 0000 2d70 2d70 0000 ..........-p-p..
000000f0: 0000 0000 0000 5b30 9370 52be 8a8a 0205 ......[0.pR.....
00000100: 3c30 16cd eebe 6f05 9e30 d9cd f0be f9b6 <0....o..0......
00000110: 7969 a769 f788 2fea 360c 0000 0070 6405 yi.i../.6....pd.
00000120: 8130 3505 6d30 7856 d1d7 ae8a 9120 a069 .05.m0xV..... .i
00000130: c688 03ea 4c81 0000 0000 9205 3930 3b05 ....L.......90;.
00000140: e230 3a04 4d81 be88 8e69 8a20 02b6 5856 .0:.M....i. ..XV
00000150: d80c 0000 0000 0000 0000 4d05 5b30 b256 ..........M.[0.V
00000160: c1d7 dd30 4170 0000 0000 2a04 2a04 0000 ...0Ap....*.*...
00000170: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000190: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001a0: 0000 ffff ffff 0000 6c00 0004 56cd 70cd ........l...V.p.
000001b0: 05e7 5e40 7d05 055c 205e e700 e7ea 07be ..^@}..\ ^......
000001c0: 56d8 0000 7d8a ea00 4000 0000 0000 0008 V...}...@.......
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
...
The byte 00 at offset 0000019f is transmitted as ff. Ouch! So I
now have some particular bit pattern being reliably corrupted by my network
connection.
So what was first assumed an ftp server configuration now turns out to be a
nasty network issue :-(. There seems to be four possible sources of the
problem:
- The server network card or driver.
- My own Linksys router.
- The Zyxel router provided by the ISP (Fullrate).
- The ISP network and switch infrastructure.
(The problem cannot be outside of these, as the problem occurs from multiple
external IP providers, and also occurs when connecting from the ftp
server to itself, but looped around the ISP network.)
So now the next step is to test each component in isolation to pinpoint the
offender (unfortunately I cannot isolate the Zyxel router from the ISP without
obtaining an ADSL simulator, which my guess is I could not afford
easily...). But the others can be tested in isolation once I get the change to
get on-site and move around cables.
I put together a small Perl snippet to test this without the complexity of a
full-blown ftp server getting in the way and confusing the ISP tech
support. Running this on the server, I can just run on the client this:
nc HOST 5376 > /dev/null
When run on a working network, this just downloads the magic 1448 bytes. But
on the bad network, the command hangs waiting for a checksum-error-free
retransmission that never comes.
To be continued!
#! /usr/bin/perl
use strict;
use warnings;
use Socket;
# Get the data with the problematic bitstream.
my $data;
open(IN, '<', 'trouble_data.raw')
or die "Failed to read trouble data: $!";
{ local $/= undef;
$data= <IN>;
}
my $listen_port= 5376;
my $proto = getprotobyname('tcp');
socket(SERVER, PF_INET, SOCK_STREAM, $proto)
or die "socket() failed: $!\n";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
or die "setsockopt() failed: $!";
bind(SERVER, sockaddr_in($listen_port, INADDR_ANY))
or die "bind() failed: $!";
listen(SERVER,SOMAXCONN)
or die "listen() failed: $!";
print "Server started successfully.\n";
my $paddr;
while ($paddr= accept(CLIENT,SERVER))
{
my ($port, $iaddr)= sockaddr_in($paddr);
my $name= gethostbyaddr($iaddr, AF_INET);
print "Got connection from '$name' [", inet_ntoa($iaddr), "], sending data...\n";
print CLIENT $data;
print "Data sent, closing connection.\n";
close CLIENT;
}
Tags: debugging, network
|
08:10 am
[Link] |
Placeholders and SQL injection, part 2
Actually, what I really wanted to blog about before getting
carried
away with irony yesterday was an old idea on how to force my developers to
use placeholders exclusively for SQL queries in applications. As should be
apparent from yesterdays blog entry, I am strongly in favour of using
placeholders for interpolating values into SQL queries, due to the great
reduction in potential bugs (including, but not limited to, SQL injections).
Basically, wrap the database API so that all database access passes through
the wrapper. This can usually be achieved, for example by subclassing DBI (for
Perl) and returning such subclasses from the application connection pool, or
other similar methods. Probably many large web applications already have such
wrappers or use APIs that can be patched or extended appropriately.
Now add code that basically bombs out with a big error message if any SQL
query contains a quote character. Something like "Always use placeholders
for interpolating values into SQL queries! If in disagreement, go see your
development lead for your regular spanking!" or words to the same effect.
Sometimes, the wrapper may sit below some code in the database API that
emulates placeholders (for example, DBD::mysql used to emulate placeholders in
the client using mysql_real_quote_string() or equivalent, since
real server-side placeholders are only available with the newer version of the
MySQL protocol for prepared statements). But even in this case, the wrapper
can still force the use of placeholders by exploiting the fact that MySQL
supports both single (')and double (") quotes. Basically, the wrapper would
set some private global variable at random to either a single or a double
quote, make placeholder emulation use one, and bomb out if the other is
detected in query strings. Then any developer trying to sneak manual quoting
into the application would quickly be caught, and subsequently taught.
The technique is not perfect. It does not catch completely unquoted number
interpolation (shudder). It will also be somewhat of an annoyance to have
to specify all string constants as placeholders (there is nothing wrong with
"SELECT value FROM t WHERE id = ? AND color = 'red'"). In the
end, I never got to implement it, also because my team was small enough and
clue-full enough that normal face-to-face talk was sufficient to make
placeholders be used throughout.
But if I ever find myself as lead or architect for a web application team, I
will be sorely tempted to implement it, as an educational means for the
developers and just to see what reactions it will cause.
Tags: developmentprocess, mysql, perl, security
|
04:06 pm
[Link] |
Placeholders and SQL injection
It is sad to see how 9X% (or should that be 99.X%?) of SQL applications are
riddled with
SQL
injection bugs.
There really is no excuse for this. Nobody writes code like this:
sub stupid_sum {
my ($list) = @_;
my $string = shift @$list;
for (@$list) {
$string .= " + " . $_;
}
my $sum = eval($string);
return $sum;
}
Right? Just because our computers use the
Von
Neumann architecture, where CPU instructions and data is stored in the
same memory, does not mean that we cannot distinguish between code and data
(ok, so in TeX
we do not, but there is a reason TeX is not pleasant to write applications
in).
So when we use functions to group our code into logical units, we have this
fancy syntax for something called parameters. And we can write clever
stuff like this:
int foo(int a) {
return a + 1;
}
And so the "+" and the "1" are part of the code for the function
foo(), while the other value "a" to be added is
data which comes from another part of the program. Great stuff!
In fact, in the old days, people were using something called
embedded
SQL, which tries to keep this distinction for using SQL with another
language. Though I have to admit, having used Oracle ProC, that this was quite
horrible.
But there is no need for embedding the SQL into the syntax of every language,
because these brilliant people invented the placeholder! So now we
can also have parameters for SQL, an SQL string holds the code to be executed,
and placeholders supply the values to be used from other parts of the program.
And it is so easy. No need for mysql_real_quote_string() and
other horrors. Just do like this, using Perl and DBI as example:
sub mark_items {
my ($dbh, $mark, @keys_list) = @_;
my $placeholders = join(",", map("?", @keys_list));
$dbh->do("UPDATE t SET flag = ? WHERE id IN (" . $placeholders . ")",
undef,
$mark, @keys_list);
}
And then you can sleep well at night not worrying about which kind of values
are passed to your SQL, or whether "+" can maybe format your hard disk if
someone passes in the wrong argument.
Just because modern dynamic languages make string concatenation easy, does not
mean that confusing code and data is a good thing. Von Neumann architecture is
good for CPUs, but at higher levels of abstraction we have moved on.
As Bjarne
Stroustrup often says, just as plumbers need an education to be allowed to
mess up pipes, why doesn't a programmer need an education that makes him or her
understand these things before being allowed to release software?
Tags: mysql, perl, security
|
11:32 am
[Link] |
Skal EU tvinge Windows brugere til Firefox?
Der er gang i
diskussionen
vedrørende EU's
mulige
krav til at lade folk vælge browser når de køber Windows. Har Microsoft
misbrugt sit monopol, er det en god ide at EU blander sig, er det for megen
indblanding, osv.
Når man installerer en Linux-distribution skal man typisk ikke vælge
browser... men man har jo også allerede valgt en Linux-distribution. Der er
fri konkurrence på området, så hvis der er efterspørgsel på Linux med andet
end Firefox som default skal det nok også blive muligt at få det.
Men når man vælger en Windows-distribution - nej, man kan ikke vælge
Windows-distribution, for den har Microsoft monopol på at lave. Nøjagtigt
ligesom alle andre, der har udviklet software, og ikke eksplicit fraskrevet
sig den eneret. Det kaldes ophavsret.
Det er jo det, som er problemet. Det er nøjagtigt det samme som med
teleinfrastrukturen og TDC. TDC ejer telenettet ("det rå kobber"), og har
derfor (qua ejendomsretten) som udgangspunkt monopol på at sælge
fastnettelefoni og internetforbindelse. Alle er enige om, at det er uholdbart,
så derfor er der indført regulering på området. Det betyder, at den del af TDC
som ejer kobberet, er forpligtet til at videresælge adgang til andre udbydere
på samme vilkår, som del del af TDC der leverer forbindelserne får. Og det
virker stort set.
Løsningen på software burde være det samme. Andre firmaer får ret til samme
vilkår til at sammensætte og distribuere Windows (andre softwarepakker) som
den del af Microsoft (andre softwarefirmaer) der står for at sælge
slutbruger-licenser. Der skal naturligvis betales den samme per-bruger/licens
pris til udviklingsdelen af Microsoft, samt til eventuelle ophavsrethavere af
andet programmel som inkluderes. Og der skal være fri adgang til at
sammensætte og modificere til det markedet efterspørger.
Men det er jo "tvangslicens", kan man det? Ja selvfølgelig. Staten stiller
hele den udøvende og dømmende magts resourcer til rådighed for Microsoft og
alle andre til at beskytte deres monopol på at distribuere deres software. Det
er jo et kollosalt privilegie. Derfor kan man naturligvis også forlange
modydelser fra statens side som afbalancerer dette privilegie til bedst mulig
gavn for forbrugerne og samfundet.
Man skal huske, at ophavsretten er et tilbud fra statens side, ikke en
pligt. Det står en ophavsrethaver frit for ikke at benytte sin ophavsret, jeg
mener ikke der er nogen ubetingede pligter for ophavsrethaver i lov om
ophavsret, ej heller nogen mulighed for andre end ophavsrethaver til at
håndhæve ophavsretten.
Desværre er det en udbredt forestilling, at de eneretter som lov om ophavsret
giver er en eller anden form for indlysende gud-given ret, ikke et privilegie
som der kan forlanges modydelser for. Så det har nok lange udsigter til at få
lavet ændringer i den retning.
Tags: dansk, it-politik, ophavsret
|
11:51 pm
[Link] |
Selecting rows holding group-wise maximum of a field, part two
Selecting rows holding group-wise maximum is a favorite problem of mine, but
one which only rarely pops up. But for some reason, after my last blog post on
the subject, it seems to be mentioned almost daily around here.
Something that I forgot to mention in the previous post is that most of the
examples there assume suitable indexing is available to get decent
performance. Basically a composite index on both the column(s) in the GROUP BY
and the column over which MAX is computed is needed. In the example I gave,
such an index is available throught the primary key.
However, such an index may not be available in all cases. Maybe maintaining it would be too
expensive, or maybe the data the max is computed over is itself the result of
a (sub-)query, and no indexing is available. So it is worth it also to
understand this case, as the performance of the different possible queries
differ greatly from the indexed case.
So let us modify the original example to not have any useful indexes:
CREATE TABLE object_versions (
id INT PRIMARY KEY AUTO_INCREMENT,
object_id INT NOT NULL,
version INT NOT NULL,
data VARCHAR(1000)
) ENGINE=InnoDB;
This time, I will use a data set of size only 1% of the previous example, as without
indexes some of the queries get ridiculously poor performance. So let us take
10,000 rows, 1000 object each with 10 versions. I use this Perl
long^H^H^H^Hone-liner to load the data:
perl -MDBI -le '$vers=10; $groups=1000; $dbh=DBI->connect("dbi:mysql:", "test",
"pass", {RaiseError => 1}); $dbh->do("USE test"); foreach $o (1..$groups)
{ $dbh->do("INSERT INTO object_versions VALUES " . join(", ", map("(null, ?,?,?)",
1..$vers)), undef, map( ($o, $_, "data_${o}_$_"), 1..$vers)); }'
(Yes, I know... but I have a strange love for Perl one-liners).
Here are the results:
mysql> SELECT data FROM object_versions o1
WHERE version = (SELECT MAX(version) FROM object_versions o2 WHERE o1.object_id = o2.object_id);
1000 rows in set (25.55 sec)
mysql> SELECT o1.data FROM object_versions o1
INNER JOIN (SELECT object_id, MAX(version) AS version FROM object_versions GROUP BY object_id) o2
ON (o1.object_id = o2.object_id AND o1.version = o2.version);
1000 rows in set (0.72 sec)
mysql> SELECT o1.data FROM object_versions o1
LEFT JOIN object_versions o2 ON o1.object_id = o2.object_id AND o1.version < o2.version
WHERE o2.object_id IS NULL;
1000 rows in set (15.44 sec)
mysql> SELECT MAX(CONCAT(version, ":", data)) FROM object_versions GROUP BY object_id;
1000 rows in set (0.52 sec)
mysql> SELECT data FROM
(SELECT * FROM object_versions ORDER BY object_id DESC, version DESC) t GROUP BY object_id;
1000 rows in set (0.04 sec)
The only query that has any kind of decent performance here is last one using
the "evil trick" of abusing the MySQL GROUP BY extensions in a way that is
explicitly documented to not produce well-defined results. Which is really
sad, since it is the only way I know of of getting the database to make the
obvious execution plan in this case, which is to simply sort the data on the
GROUP BY expression, and then loop over the rows picking the max row in each
group on the way.
In fact, in many cases I think a decent alternative is to just select
all rows into the client using ORDER BY, and do the aggregation there.
Now I just need someone to implement my SELECT MAX(object_id,
version) ...
Tags: mysql, sql
|
11:15 pm
[Link] |
Det skal for øvrigt være med kælder!
En entreprenør får bestilling på et hus. Arkitekttegningerne kommer på plads,
byggeriet går igang, der er rejsegilde, og man bliver klar til
aflevering. Bygherren får nøglerne og flytter straks ind, flyttebilen er der
allerede. Og da flyttelæsset er på plads, kommer det fra bygherren: "Hov
forresten, jeg kommer i tanke om noget: Kan I ikke lave det med kælder?"
Dette er næppe hverdagskost i byggebranchen. Jeg ved ikke, om det er teknisk
muligt at tilføje en kælder til et hus uden væsentlig forstyrrelse af
beboerne, men det er i hvert tilfælde ikke cost-effektivt.
Men i software-udvikling er det analoge scenarie
hverdagskost. Grundlæggende ændringer i design og funktionalitet i systemer,
som er i brug. Det er naturligvis en stor styrke, at vi er i stand til det,
men kan man nogen gange savne forståelse hos ikke-teknikere for, hvor meget
mere omkostningsfuldt det bliver, når kælderen skal bygges, efter at beboerne
er flyttet ind. Og det er trist når resultatet, som det ofte sker, bliver en
omgang byggesjusk.
Tags: dansk, developmentprocess
|
11:36 pm
[Link] |
Selecting rows holding group-wise maximum of a field
Today there was a question on the Freenode MySQL channel about a classical
problem: Rows
holding group-wise maximum of a column. This is a problem that I keep
encountering every so often, so I thought I would write up something about it.
A good example of the problem is a table like the following holding versioned
objects:
CREATE TABLE object_versions (
object_id INT NOT NULL,
version INT NOT NULL,
data VARCHAR(1000),
PRIMARY KEY(object_id, version)
) ENGINE=InnoDB
Now it is easy to get the latest version for an object:
SELECT data FROM object_versions WHERE object_id = ? ORDER BY version DESC LIMIT 1
The query will even be very fast as it can use the index to directly fetch the
right row:
mysql> EXPLAIN SELECT data FROM object_versions
WHERE object_id = 42 ORDER BY version DESC LIMIT 1;
+----+-------------+-----------------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | object_versions | ref | PRIMARY | PRIMARY | 4 | const | 3 | Using where |
+----+-------------+-----------------+------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)
But what if we want to select the latest version of all (or some range
of) objects? This is a problem that I think standard SQL (or any SQL dialect
that I know of, including MySQL) has no satisfactory answer to.
Intuitively, the problem should be simple for the database engine to
solve. Just traverse the BTree structure of the primary key (assume InnoDB
clustered index storage here), and for each value of the first part of the
primary key (object_id), pick the highest value of the second
part (version) and return the corresponding row (this is similar
to what I believe is sometimes called index skip scan). However, this
idea is surprisingly difficult to express in SQL.
The first method suggested in the above link to the MySQL manual works in this
case, but it is not all that great in my opinion. For example, it does not work
well if the column that MAX is computed over is not unique per group (as it is
in this example with versions); it will return all of the maximal rows which
may or may not be what you wanted. And the query plan is not all that great either:
mysql> EXPLAIN SELECT data FROM object_versions o1 WHERE version =
(SELECT MAX(version) FROM object_versions o2 WHERE o1.object_id = o2.object_id);
+----+--------------------+-------+------+---------------+---------+---------+-----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+---------+---------+-----------------------+------+-------------+
| 1 | PRIMARY | o1 | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
| 2 | DEPENDENT SUBQUERY | o2 | ref | PRIMARY | PRIMARY | 4 | einstein.o1.object_id | 1 | Using index |
+----+--------------------+-------+------+---------------+---------+---------+-----------------------+------+-------------+
2 rows in set (0.00 sec)
It is apparently doing a full table scan with an index lookup for every row in the table,
which is not that bad, but certainly more expensive than necessary,
especially if there are many versions per object. Still, it is probably the
best method in most cases (or so I thought first, but see benchmarks below!).
The two other suggestions from the MySQL manual are not perfect either (though
the first one is blazingly fast, see benchmarks below). One is
to use an uncorrelated subquery with a join:
mysql> EXPLAIN SELECT o1.data FROM object_versions o1
INNER JOIN (SELECT object_id, MAX(version) AS version FROM object_versions GROUP BY object_id) o2
ON (o1.object_id = o2.object_id AND o1.version = o2.version);
+----+-------------+-----------------+--------+---------------+---------+---------+-------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+--------+---------------+---------+---------+-------------------------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | PRIMARY | o1 | eq_ref | PRIMARY | PRIMARY | 8 | o2.object_id,o2.version | 1 | |
| 2 | DERIVED | object_versions | index | NULL | PRIMARY | 8 | NULL | 6 | Using index |
+----+-------------+-----------------+--------+---------------+---------+---------+-------------------------+------+-------------+
At first, I actually did not know exactly how to interpret this plan
output. After the benchmarks given below, I now think this plan is actually
very good, apparently it is first using something like an index skip scan to
compute the MAX() in the uncorrelated subquery, and then looking
up each row using the primary key. It still has the issue with multiple rows
if version was not unique per object.
The other suggestion uses an outer self-join:
mysql> EXPLAIN SELECT o1.data FROM object_versions o1
LEFT JOIN object_versions o2 ON o1.object_id = o2.object_id AND o1.version < o2.version
WHERE o2.object_id IS NULL;
+----+-------------+-------+------+---------------+---------+---------+-----------------------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------+---------+-----------------------+------+--------------------------------------+
| 1 | SIMPLE | o1 | ALL | NULL | NULL | NULL | NULL | 6 | |
| 1 | SIMPLE | o2 | ref | PRIMARY | PRIMARY | 4 | einstein.o1.object_id | 1 | Using where; Using index; Not exists |
+----+-------------+-------+------+---------------+---------+---------+-----------------------+------+--------------------------------------+
2 rows in set (0.00 sec)
The plan again looks reasonable, but not optimal. And somehow, all three
methods feel unnatural for something that ought to be simple to express.
And in fact, there is a nice way to express this in SQL, except that it does
not work (at least not in MySQL):
SELECT MAX(version, data) FROM object_versions GROUP BY object_id;
If there was just support for computing MAX() over multiple
columns like this, this query would be a nice, natural, and simple way to
express our problem. And it would be relatively easy for database engines to
create the optimal plan, I think index skip scan is fairly standard
already for single-column MAX() with GROUP BY. And
the syntax feels very natural, even though it does bend the rules somehow by a
single expression (MAX(version, data)) returning multiple
columns. I have half a mind to try to implement it myself in MySQL or Drizzle
one day ...
In fact, one can almost use this technique by an old trick-of-the-trade:
mysql> SELECT MAX(CONCAT(version, ":", data)) FROM object_versions GROUP BY object_id;
+---------------------------------+
| max(concat(version, ":", data)) |
+---------------------------------+
| 2:foo2 |
| 1:bar |
| 3:baz2 |
+---------------------------------+
3 rows in set (0.00 sec)
mysql> EXPLAIN SELECT MAX(CONCAT(version, ":", data)) FROM object_versions GROUP BY object_id;
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+-------+
| 1 | SIMPLE | object_versions | index | NULL | PRIMARY | 8 | NULL | 6 | |
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.00 sec)
Though I consider this (and variations thereof) a hack with limited practical
usage.
And speaking of hacks, there is actually another way to solve the
problem, one which I learned about recently at a customer:
mysql> EXPLAIN SELECT data FROM
(SELECT * FROM object_versions ORDER BY object_id DESC, version DESC) t GROUP BY object_id;
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6 | Using temporary; Using filesort |
| 2 | DERIVED | object_versions | index | NULL | PRIMARY | 8 | NULL | 6 | |
+----+-------------+-----------------+-------+---------------+---------+---------+------+------+---------------------------------+
Get it? This cleverly/evilly (ab)uses the MySQL non-standard extension which
allows SELECT of columns not in the GROUP BY clause even without using
aggregate functions. MySQL will return a value for the column from an
"arbitrary" row in the group. In practise, it chooses it deterministically
from the first row in the group, which is why this trick seems to work well in
practise. But it is clearly documented to not be supported, so not
really something to recommend, though interesting to see.
Bonus benchmark
As a free bonus, I decided to run some quick benchmarks. As it turns out, the
results are quite surprising!
So I filled the table above with 1,000,000 rows, 1000 objects each with 1000
versions. Total table size is about 50Mb or so. I then ran each of the five
above queries:
mysql> SELECT data FROM object_versions o1
WHERE version = (SELECT MAX(version) FROM object_versions o2 WHERE o1.object_id = o2.object_id);
1000 rows in set (4 min 22.86 sec)
mysql> SELECT o1.data FROM object_versions o1
INNER JOIN (SELECT object_id, MAX(version) AS version FROM object_versions GROUP BY object_id) o2
ON (o1.object_id = o2.object_id AND o1.version = o2.version);
1000 rows in set (0.01 sec)
mysql> SELECT o1.data FROM object_versions o1
LEFT JOIN object_versions o2 ON o1.object_id = o2.object_id AND o1.version < o2.version
WHERE o2.object_id IS NULL;
1000 rows in set (2 min 42.72 sec)
mysql> SELECT MAX(CONCAT(version, ":", data)) FROM object_versions GROUP BY object_id;
1000 rows in set (0.63 sec)
mysql> SELECT data FROM
(SELECT * FROM object_versions ORDER BY object_id DESC, version DESC) t GROUP BY object_id;
1000 rows in set (15.61 sec)
The differences are huge!
The clear winner is query 2, the uncorrelated subquery. Apparently it can do
an index skip scan for the inner MAX/GROUP BY query, followed with a primary
key join, so it only ever has to touch 1000 rows. An almost optimal plan.
Query 1 and 3 (correlated subquery and outer join) are spectacularly bad. It
looks as if they are doing something like for each 1,000,000 rows in the full
table scan, do a 1000 row index range scan in the subquery/join, for a total
of 1 billion rows examined. Or something. Not good.
Query 4 and 5, the trick queries, are doing so-so, probably they get away with
a sort of a full table scan of 1,000,000 rows.
Conclusions: Uncorrelated subquery is the undisputed winner!
Tags: mysql, sql
|
06:02 am
[Link] |
Slides for my lightning talks at Open Source Days 2008
In case anyone is interested in a copy of my slides for the two lightning
talks I gave at the Open Source Days 2008 conference, I have
made them available here:
- "Optimizing Large Databases Using InnoDB Clustered Indexes:"
HTML
and
PDF.
- "Profiling with OProfile and Intel Core 2 performance counters:"
HTML
and PDF.
I waqs quite pleased with the benchmark that I prepared for the InnoDB
mini-talk, where I measure the performance difference between clustered and
auto_increment primary key, both with data that fits in memory and with data
that does not. I have wanted to do this benchmark for quite some time, as I
have not really seen real results for this before, though the technique of
using clustered primary keys for performance is well-known.
The results are quite interesting, with clustered indexes being faster for I/O
bound load with more than an order of magnitude. It is also interesting to see
how dead-cheap hardware can do 23k queries/second and read 1000000
rows/second with a properly tuned database.
Tags: innodb, linux, mysql, oprofile, talk
|
07:52 pm
[Link] |
Free SSL certificates, part 2
Previously I
noticed that StartSSL is offering free
SSL certificates, ie. proper certificates with a CA chain that is accepted by
browsers. Now while setting up my own web-server I had the opportunity to try
it out.
It seems to work well. The certificate seems to work, Firefox 3 does not spit
out any errors or warnings as it is otherwise very keen to do on self-signed
certificates. The certificate is obtained without too much hassle, and fairly
quickly, surely much easier than any CA with paper trail requirements.
I did have an issue with the site requiring the browser to send Referer
headers, which I have disabled. Using referer headers for functionality is
usually a poor idea, requiring it goes against the HTTP spec, and there
is no added security since it is trivial to forge it. Anyway, I switched to
using the RefControl
Firefox extension, which is much more flexible that what is built into
about:config. The Block (3rd party) option suits me well.
Other than that, it is just a matter of registering an account (with the
usual email confirmation mail exchange). Then validate your domain, which is
the site's way of making it difficult for someone to make a certificate
claiming ownership of a domain that belongs to another person. Basically you
have to be able to read mail sent to some address associated with the DNS of
the domain, like the one in the SOA record or
postmaster@your.domain, etc.
Now generate the openssl certificate signing request as described in the
Apache/mod_ssl
documentation, and upload it in the appropriate web form. In a matter of
seconds the properly signed server.crt is returned, and can be
installed in Apache using the SSLEngine on, SSLCertificateFile,
SSLCertificateKeyFile, and SSLCertificateChainFile
directives. Ah, and also remember to remove the passphrase on the server key,
or the Apache startup will hang your boot process on the next reboot, waiting
for someone to type the passphrase.
I still find it a hassle to have to go through this just to be able to encrypt
web traffic, something that really should have been standard since long. The
real problem is that the only available standard using https://
URLs mixes two completely different issues: that of encrypting the data
channel, and that of validating that the server is who the client think it
is. Due to this mixup, implementations are left with the dilemma of either
compromising the latter or making the former a hassle, and mainstream browsers
generally choose the hassle option.
At least with StartSSL, encryption is possible both free of cost and
(relatively) free of administrative hassle. But there is still a number of
technical problems and hassle. Like the fact that it is not possible to run
multiple domains on the same IP address without browsers spitting out warnings
en masse. Well, at least my StartSSL certificate seems to work both for
versions of the associated domain both with and without the www.
prefix.
Tags: apache, linux
|
05:55 pm
[Link] |
Hobbit monitoring
After setting up Hobbit
monitoring on my home network, I discovered a curious issue with the Zyxel
P-2602HW router that Fullrate delivers
with their ADSL products: The Telnet administration port occasionally
times-out connection attemps:
I really like the Hobbit
monitoring tool, I have used it ever since I heard about it when I met the
author, Henrik Storner, at a local LUG meeting. It really obeys my favorite
software principle, "simple things should be simple, complex things should be
possible" (and it only helps that it works similarly to BigBrother, which I
learned on a previous job).
Setting it up is litterally as simple as this:
sudo apt-get install hobbit hobbit-plugins
That sets up a lot of useful monitoring (network, disk, memory, cpu, ...),
everything with very reasonable defaults. Then it was just a matter of adding
a few lines in /etc/hobbit/bb-hosts for the routers, other hosts,
and uplink peer node. It will be interesting to see how stable the shiny new
Fullrate connection is!
Hobbit comes preconfigured with a lot of nice stuff. It's not just the basic
stuff, either. It automatically sets up monitoring of configured apt
repositories so that alerts appear when new security patches are
available. And when I set up https/SSL on my local Apache (using StartSSL to obtain a free, valid certificate!), just
adding a https://1.2.3.5/ entry in
/etc/hobbit/bb-hosts provides not only monitoring of the service
running, but also of when the certificate is about to expire, which in almost
a year's time I would be bound to forget otherwise. Custom monitoring is both easy and
flexible to add ass well (haven't needed that on the home network yet, though).
And why do I need to monitor my home network? Well, I believe any network
where one cares at all about it working should be monitored (and if one does
not care, why not turn the power off, save some CO2 pollution?)
Well, I don't really use the Telnet administration port on the router that
often, so it is not a problem for me. Except that of course I want to avoid
alarms about this, as there is probably nothing I can do about it, complaining
to Fullrate does not really seem either worthwhile or likely to help.
Hobbit has a simple way to do this, just add
badtelnet:<ignore>:<warn>:<error> to the entry
in /etc/hobbit/bb-hosts (numbers are levels for ignore, warning,
and error):
1.2.3.4 zyxel # telnet badtelnet:1:3:4
Annoyingly, I first omitted the telnet entry on the line, leaving
only the badtelnet, which neither works nor gives any errors. The
documentation is not extremely clear on this point, so I had to sleep on it
until I vaguely remembered that I might have run into the same problem a
couple of years ago, but had completely forgotten. Hopefully now that I
blogged it I will not forget again!
Tags: linux
|
09:36 pm
[Link] |
Træ-algoritmer
Så er træerne plantet!
Algoritmen til venstre er et espalier af palmette-typen. Det lille et-års
æbletræ (Rød Aroma, der står et Elstar ved siden af) er klippet af 10 cm over
snoren for at frembringe nye skud til foråret som skal bindes vandret ud til
hver side. Næste år klippes der over en ny snor en etager over, og så
fremdeles indtil i alt fire etager.
Algoritmen til højre er et spindeltræ (sødkirsebær af sorten Stella). Alle
grene er bundet ned til vandret med snore, og toppen er klippet for at
frembringe en ny etage sidegrene til foråret.
Der er også sørget for redundans, for det tilfælde at et af træerne ikke
skulle vokse ordentligt til. Uden de ekstra træer ville "time to recovery"
være minimum et år.
Der er ikke plads til alle de træer i haven, så der skulle være gode
muligheder for at tiltuske sig et gratis frugttræ eller to hos mig til næste
efterår...
Så er det bare med at vente på foråret, der sker ingenting de næste 6 måneder.
Tags: dansk, gardening
|
10:50 am
[Link] |
Garbage collection and memory leaks
So, with automatic Garbage Collection, memory leaks are a thing of the past,
right?
Well, not quite, of course. Automatic Garbage Collection, like most other
automated programming techniques, necessarily needs to approximate
(ie. guess), and while the guesses of garbage collectors are generally very
good, they cannot magically predict the future.
The classic way that a program fools the garbage collector is by maintaining a
global list of objects created. This is a common technique for memory
management in non-garbage-collected programming environments, where such lists
will be used to ensure eventual free() of all objects in some
longer-lived manager object.
Unfortunately, if such techniques carry over to garbage collected environment,
the effect is to effectively disable the automatic Garbage Collection
algorithm! Every object allocated will have an outstanding reference in
the global list, causing garbage collection to consider all of it live data,
which cannot be deallocated.
I got hit by this while working on an Adobe Flash application
using Papervision 3D. It was
leaking tons of memory (on the order of 20Mb for every user interaction!).
Turns out that this was a case of the above-mentioned problem. It turns out that
every Papervision 3D MaterialObject3D object registers itself in
the constructor with the MaterialManager singleton, which is
nothing but a global list of allocated material objects. The result is that
no material objects (which all inherit from the
MaterialObject3D class) will get deallocated ever, unless
explicitly done by the application (using
MaterialObject3D::destroy() for example). So much for automatic
Garbage Collection...
A simple Google search reveals that I am not
the
first
one
to be hit by this.
The problem here is a poor design in the Papervision 3D API in this respect,
made even worse by the fact that the documentation does not seem to mention
this aspect at all (at least I could not find it). A constructor that
automagically registers the object in a global list is generally a really poor
idea in garbage collected programming environments.
I am not sure why Papervision 3D does this. Maybe it could just be removed. Or
alternatively, a better approach would be for the API to make it explicit that
such global registration is taking place. For example by forcing the application to
explicitly call some MaterialObject3D::register(), documenting
the need for unregister/destroy, or by not providing a constructor at all, and
instead creating objects in some factory class, again making clear mention of
the need to unregister/destroy.
Tags: adobeflash, gc, programming
|
10:07 am
[Link] |
Lightning talks at Open Source Days 2008
I am giving two lightning talks at Open Source Days on October 3-4. One on improving database I/O performance using clustered indexes with MySQL/InnoDB, and one on advanced profiling with OProfile.
Hope to meet up with a lot of people there!
Tags: freesoftware
|
09:26 pm
[Link] |
Giv tid!
Vores røde ananas-æbler står utroligt flot her en sensommer-eftermiddag:
Jeg holder meget at træer. En af grundene er, at træer tager tid. At plante et
æbletræ er at planlægge mange år frem i tiden. Og ingen forventer, at et træ
kan vokse op på få måneder eller år. Det giver en ro at arbejde med træer.
God software tager også tid. Linux, Apache, Perl, Emacs, GCC, det er alle
projekter som også fandtes for mange år siden mens jeg stadig gik på
Universitetet. Selv KDE er snart 12 år gammelt.
Jeg tror at en vigtig faktor i de mange Open Source successer er, at man har
taget sig tid til at gøre tingene ordentligt. Jeg har alt for tit oplevet,
hvordan manglende forståelse for hvor lang tid softwareudvikling tager har
kørt gode projekter af sporet (og ikke kun indenfor proprietær software, se
bare MySQL 5.0/5.1).
Tags: dansk, freesoftware
|
03:19 pm
[Link] |
Does the Flash plugin play dirty tricks with the X keyboard?
For a client, I am using a web application written in Flash. This
application behaves really strange with respect to the
keyboard on Linux x86_64 (Ubuntu Hardy).
I re-map the caps-lock key as a control key (a required action for any hardcore Emacs user to preserve sanity). Now
the application has a hotkey CTRL-SHIFT-F to search. This hotkey works with
the "real" control keys, but not with the re-mapped caps-lock key.
Not that I am blaming the application... it should not even be possible for it
to tell the difference between different control keys. In fact I cannot think
of _any_ other program that does not work with re-mapped keys. Technically, it
is not even "re-mapped", in X11 all keymaps are created equal.
The flash plugin must be doing some really nasty stuff with
X11. Probably it has hardcoded PC keycodes in the source code instead of using
proper X11 keysyms ... now that would be sucky code.
Tags: linux
|
12:47 pm
[Link] |
BIT-aligned storage
I have been working on a set of efficient C++ classes for storing N-bit
(N<=64) values at arbitrary bit offsets into a buffer. Essentially a
way to address memory bit-wise, rather than the usual byte-wise or word-wise.
The classes support either fixed-sized bitfield storage (eg. say 27-bit
values), or compressed values where small numbers are stored in fewer bits
than large numbers (for example 0-15 are stored in 6 bits, 16-2**24 are stored
in 26 bits, and so on).
The basic idea is of course to save space when storing large amounts of data
eg. for a relational database. Today most systems tend to pad much of their
numbers to 32 or 64 or whatever bits, wasting space. This is done to allow
efficient access, as it is much faster to access data when it is sized and
aligned to one of the word sizes supported by the machine.
Or is it really?
Todays CPUs are so fast compared to I/O (disk, network) and even to main
memory that this may be about to change. In fact my experiments show that
already today, there are cases where working with large data sets in-memory is
as fast or faster when stored at arbitrary bit offset rather than padded to
machine word size. A very interesting, and surprising, result.
For the test, I created a list containing the sizes (in bytes) of each file on
my harddisk. So I consider this "real world" data. The biggest file is
>4GB, so this would normally be considered a 64-bit data set.
The reference code loads the data into an array of 2,000,000 64-bit words (the
list is repeated as necessary to reach a total of 16MB, which exceeds the CPU
L2 cache of 2MB). It measures how long it takes to execute this loop:
uint64_t sum= 0;
for (uint j= 0; j < 1000; j++)
for (uint i= 0; i < 2000000; i++)
sum+= b[i];
}
}
We just load and add up all the values, so the actual data processing here
is minimal (just an add instruction), since we want to focus on the cost of
loading the data in different ways.
I implemented, optimised, and tested a number of different implementations of
reading arbitrary-sized bitfields and compressed numbers. This one
consistently performed the best:
class pack_ptr {
const uint64_t * const base;
uint64_t v;
uint64_t i;
uint64_t bitsleft;
public:
pack_ptr(const uint64_t *_base) :
base(_base), v(0), i(0), bitsleft(0) { }
uint64_t getbits(uint64_t numbits)
{
if (bitsleft >= numbits)
{
uint64_t r= v & (~(uint64_t)0 >> (64 - numbits));
v>>= numbits;
bitsleft-= numbits;
return r;
} else {
uint64_t r= v;
v= base[i++];
r= (r | (v << bitsleft)) & (~(uint64_t)0 >> (64 - numbits));
if (likely(!(numbits == 64 && bitsleft == 0)))
v>>= numbits - bitsleft;
else
v= 0;
bitsleft= bitsleft + (64 - numbits);
return r;
}
}
uint64_t unpack1()
{
uint64_t n= getbits(3);
return getbits(n*9+1);
}
};
The code is pretty straight-forward. The main tricky thing is to ensure that
we never execute a shift by 64 bit, as that is undefined in C/C++. The
condition numbits == 64 is not strictly necessary, but it makes
it possible for the compiler to eliminate the conditional completely when the
value of numbits is fixed, and reduces the risk of CPU
misprediction of the conditional when it cannot be eliminated. The compressed
values here are stored as an initial 3 bits of SIZE, followed by SIZE*9 + 1
bits of the real value. So 0/1 will be stored in 4 bits, 2-1023 will be stored
in 13 bits, and so on. The test number list is compressed down to 29% of the
original 16MB (but still does not fit the CPU cache).
Here then are the loops for fixed-width bitfields and compressed values:
uint64_t sum= 0;
for (uint j= 1000; j > 0; j--)
{
pack_ptr p(base);
for (uint i= 2000000; i > 0; i--)
sum+= p.getbits(33);
}
uint64_t sum= 0;
for (uint j= 1000; j > 0; j--)
{
pack_ptr p(base);
for (uint i= 2000000; i > 0; i--)
sum+= p.unpack1();
}
And here are the results, times in seconds for 1,000 loops of adding 2,000,000
numbers (a total of 2 billion values processed):
| |
Aligned 64-bit words |
Fixed-size bitfield |
Packed values |
| Laptop (Core 2 Duo@2.4GHz) |
4.32s |
4.51s |
10.38s |
| - with two threads |
7.93s |
5.19s |
10.48s |
| Desktop (Core 2 Quad@2.4GHz) |
2.96s |
4.32s |
10.27s |
| - with two threads |
4.53s |
4.38s |
10.27s |
| - with four threads |
9.63s |
5.21s |
10.36s |
The test is run on two machines, one a dual-core laptop and the other a
quad-core desktop machine. As can be seen, the desktop has a somewhat better
memory system, so the maximum throughput is higher. On the laptop, even with
only one thread the fixed-size bitfield is only 5% slower than aligned 64-bit
words. And with two threads it is faster (on both machines) since main memory
bandwidth becomes the bottleneck. On the quad-core with all 4 cores running,
even the packed values are only 8% slower than aligned 64-bit access!
I find the numbers very interesting. Even when reading from main memory,
bit-packing data like this has very modest overhead (sometimes it is even
faster), and very respectable compression ratios (down to 52% and 29% of
original size, respectively). For sending the data to disk or over network,
bit-aligned access can be a big win. And as CPU speed continue to improve
faster than memory and I/O speed, this will only get more interesting (6 and 8
core commodity CPUs are on their way).
The code (still a somewhat rough prototype) is available for anyone interested on
repo.or.cz. It is licensed under
the GPL.
The code generated by GCC 4.2.3 for the fixed-size bitfield loop is quite
illuminating:
.L142:
movq %r12, %r8
subq $33, %r10
shrq $33, %r12
andq %rdi, %r8
addq %r8, %rdx
subl $1, %r9d
je .L133
.L135:
cmpq $32, %r10
ja .L142
movq (%rbx,%r11,8), %rax
movl %r10d, %ecx
addq $1, %r11
movq %rax, %r8
salq %cl, %r8
movl $33, %ecx
orq %r12, %r8
subl %r10d, %ecx
movq %rax, %r12
andq %rdi, %r8
shrq %cl, %r12
addq $31, %r10
addq %r8, %rdx
subl $1, %r9d
jne .L135
(note that GCC puts the loop entry at .L135). That is just 9
instructions for the common case where the value to be fetched fits in a
single 64-bit word, and 17 instructions for the case where the next word
needs to be fetched.
The CPU (Intel Core 2 Duo) is able to extract and process one fixed-size
bitfield in around 5 cycles, and one packed value in around 13 cycles. For a
2.4GHz CPU using 64-bit values, that is 3.84 respectively 1.48 GByte/sec
per CPU core, or 15.36 / 5.91 GByte/sec on a quad-core Q6600 CPU. That
is approaching or exceeding the bandwidth of many memory subsystems even in
modern computers. I am very impressed with the technology of both GCC and
Intel Core 2 Duo/Quad here, the former generating near perfect code and the
latter executing around 2.5 instructions per clock on average in my benchmarks
(that is 10 billion instructions per second total on the quad-core costing
<200 euro!).
In a later posting I plan to describe the putbits() and pack() methods, and
investigate the performance of several different alternative implementations
(initial tests show that putbits() and pack1() are
of the same level performance-wise as getbits() and
unpack1()).
Tags: programming
|
09:10 am
[Link] |
Installing GCC-4.3.1
I sometimes see even seasoned developers be reluctant to use a different gcc
version (ie. to reproduce a potentially compiler-specific issue or just
to experiment with new optimisations or features). The reason is a fear of
breaking their system by installing multiple compilers, or even by replacing
their system compiler with a new one.
But actually it is very easy to install multiple gcc versions without any risk
of "polluting" the development environment. At least on Debian or
Debian-based distributions (I am using Ubuntu) which can automatically install
dependencies.
sudo apt-get build-dep gcc-4.1 g++-4.1
That is a brilliant feature of Debian apt, a life-saver when
compiling sources with complex dependencies.
Here is how it goes:
cd /usr/local/src/
wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-4.3.1/gcc-core-4.3.1.tar.bz2
wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-4.3.1/gcc-g++-4.3.1.tar.bz2
mkdir gcc-4.3.1
cd gcc-4.3.1/
bzip2 -dc ../gcc-core-4.3.1.tar.bz2 | tar xf -
bzip2 -dc ../gcc-g++-4.3.1.tar.bz2 | tar xf -
mkdir build
cd build
../gcc-4.3.1/configure --prefix=/usr/local/src/gcc-4.3.1 --disable-multilib
time make -j4
make install
These days it doesn't even take long to build, just under 15 minutes on a
2.4GHz Q6600 (cheap quad-core Intel).
Now, the new GCC is installed in a completely seperate location, so it does
not interfere in any way with normal work. But to use it to build some
particular software, just do something like this:
PATH="/usr/local/src/gcc-4.3.1/bin:$PATH" ./configure
PATH="/usr/local/src/gcc-4.3.1/bin:$PATH" make
Well, there is one snag, that is why I have the --disable-multilib in
the above ./configure. Otherwise I get this:
In file included from /usr/local/src/gcc-4.3.1/build/./gcc/include-fixed/features.h:355,
from /usr/include/stdio.h:28,
from ../../../../gcc-4.3.1/libgcc/../gcc/tsystem.h:90,
from ../../../../gcc-4.3.1/libgcc/../gcc/libgcc2.c:33:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
This is some problem with Ubuntu Feisty which puts that include file in
/usr/include/i486-linux-gnu/gnu/stubs-32.h where apparently GCC does
not look for it. I do not need to build any 32-bit binaries, so
--disable-multilib is a quick solution.
Hopefully this is fixed
somehow in Hardy.
Tags: linux, programming
|
12:25 pm
[Link] |
Free SSL certificates
It seems that one can obtain a valid SSL server certificate here at no cost.
I will want to remember that for my next web site. I always found it a bit
silly to not support SSL/https on a web site given that Apache has SSL
support built-in. But it is a bit of a showstopper that encryption is tied in
with certification of identity of the web server, causing the need for buying an
expensive certificate (and having all the hassle of ordering it), or
alternatively have the users being presented with an obnoxious warning about
"invalid certificate".
How come I cannot just create my own SSL key for the purpose of encrypting
securely the communications of my users? For this purpose the certificate is
certainly perfectly valid, despite the browser's claim of the opposite. The
self-signed certificate does not provide any additional guarantee about who
the web site owner is, but that is usually of less concern. So why cannot we
have these two issues being separate?
Or put it another way, how come that a self-signed certificate is considered
invalid, but no certificate (using plain unencrypted http) is perfectly
ok and not flagged by any warning?
Tags: http, linux, ssl
|
12:12 pm
[Link] |
Using GMail from Gnus on Ubuntu Hardy
I have used GNUS (Emacs news/mail reader) to read mail for ages, so I wanted
to use it to access my GMail account as well.
I managed to get it to work on my Ubuntu Hardy installation, though not
without some hacking (I did not spend that long looking for a cleaner way).
GMail supports both POP and SMTP, so that is the basics for getting it to
work, but the problem is that TLS/SSL is required for both and getting that
configured in Gnus was a bit tricky for me. But I got it working in the
following way:
- sudo apt-get install gnutls-bin gnutls-doc starttls
- The smtpmail.el included in emacs21 does not handle TLS (and the
gnus package depends on emacs21, did not check if gnus is available
for emacs22 in Ubuntu Hardy). But Gnus comes with another smtpmail.el
which does support TLS, so I copied that into my load path instead
(/usr/share/doc/gnus/contrib/smtpmail.el.gz).
- I had to hack this small patch into smtpmail.el to get it to use
STARTTLS, not sure what the real problem is here:
--- smtpmail.el 2008/06/06 18:12:50 1.1
+++ smtpmail.el 2008/06/06 18:13:15 1.2
@@ -472,7 +472,7 @@
(let ((cred (smtpmail-find-credentials
smtpmail-starttls-credentials host port)))
(if (null (and cred (condition-case ()
- (with-no-warnings
+ (progn
(require 'starttls)
(call-process (if starttls-use-gnutls
starttls-gnutls-program
- Configure pop and smtpmail in .emacs:
(setq gnus-secondary-select-methods '((nnml "")))
(setq mail-sources
'( (pop :server "pop.gmail.com" :port 995 :user "XXX@gmail.com" :stream ssl)
))
(setq nnmail-pop-password-required t)
(setq smtpmail-smtp-server "smtp.gmail.com")
(setq smtpmail-smtp-service 587)
(setq smtpmail-local-domain nil)
(setq smtpmail-auth-credentials '(("smtp.gmail.com" 587 "XXX@gmail.com" nil)))
(setq smtpmail-starttls-credentials '(("smtp.gmail.com" 587 nil nil)))
(setq send-mail-function 'smtpmail-send-it)
(setq smtpmail-default-smtp-server "smtp.gmail.com")
(setq smtpmail-debug-info t)
(setq message-send-mail-function 'smtpmail-send-it)
So a bit of a pain, but I got it to work (and I am really used to Gnus
by now and would hate to switch to something else).
Tags: gnus, linux, mail
|
[<< Previous 20 entries] |