Friday, April 27. 2012
Fun with Cluster and Locking
I've been dealing with MySQL Cluster in one way or another since around 2005 (back in the MySQL 4.1 days), but it is still full of "funny" surprises. This post is a collection of different locking related issues i ran into during the previous weeks that i had not been aware of up to now (or may simply have forgotten over time).
== Unique hash indexes lock exclusively ==
This is the one that regular users are most likely to run into: in general row locks in MySQL Cluster distinguish between reads and writes, so that writers block other writers but not readers, and readers from other transactions always see the last committed row value (Cluster currently only supports the READ COMMITTED isolation level). As soon as you have a secondary unique index in addition to a primary key things are different though. Internally a unique index that is not the primary key is implemented as a unique hash index in Cluster (and optionally also as an additional ordered T-Tree), and for unique hash entries there only seems to be exclusive locking, so that writers will block readers across transactions. See http://bugs.mysql.com/65086 for actual examples.
== Long running transactions can block starting nodes ==
Less likely to happen, but very annoying if it happens and you don't know about it: a starting node needs to lock all rows for a very short period of time near the end of start phase 5. At this point it needs to wait for all current transactions to free the locks they are holding (by either COMMIT or ROLLBACK). This is usually not a problem as cluster transactions are not supposed to last very long anyway, but if there happens to be a long running transaction it can potentially block a node start forever. A stopping node will wait for pending transactions for 5 seconds and will then terminate these transactions the hard way; a starting node on the other hand will gracefully wait forever.
This is bad in two ways:
1) it is obviously bad availability wise, as you may end up with a non-redundant configuration for extended periods of time, without any way to automatically identify and terminate the offending transaction(s)
2) currently the blocked starting node does not even tell you what it is waiting for, it just silently sits and waits. So unless you know what is going on you are faced with a node that is simply stuck without doing anything (CPU, disk and network load next to zero)
The related bug report is http://bugs.mysql.com/65037 ; proposed solutions are meaningful log messages in the short term and killing active transactions after a grace period in the long run.
== Weird error message on lock timeout ==
This last item is related to the non-standard INSERT ... ON DUPLICATE KEY UPDATE construct. What can happen here is that the INSERT part fails with a lock wait timeout as the key is write locked by another transaction. The expected error message in this case would simply be "Lock wait timeout exceeded", but what you'll actually get is "Transaction already aborted". What seems to happen here is that the ON DUPLICATE KEY UPDATE part is triggered by any error in the INSERT phase and not only by a duplicate key violation. So the INSERT part is tried and fails with "Lock wait timeout" (which implicitly rolls back the current transaction), next the UPDATE part is tried, fails with "Transaction already aborted" and overwrites the previous error ... The related bug report is http://bugs.mysql.com/65130
Friday, April 6. 2012
Partition fun again, today with ARCHIVE
After the fun with InnoDB and MyISAM and massive partitioning it's time to move on to some other storage engines. So for the next round of fun i chose ARCHIVE:
PS: MariaDB does not seem to be affected ...
Monday, March 12. 2012
A different kind of static?
While actually trying to check for something completely different (more on that later maybe)
So my /usr/local/mysql-5.5.21/bin directory now uses 130MB instead of the 17MB that
Wednesday, March 7. 2012
Fun with partitions and MyISAM, part #2
This is part of the reply i got on my bug report on the ALTER TABLE issue with MyISAM and a large number of partitions from my previous post:
So this should not only affect ALTER TABLE; all operations on MyISAM tables with a large number of partitions should eventually run into this. Let's try and see what happens ...
Continue reading "Fun with partitions and MyISAM, part #2"
Wednesday, February 29. 2012
More fun with partitions, this time with MyISAM
Looks as if MyISAM tries to open all the 1000 partition .MYD/.MYI files at once and runs out of file handles during this operation ...
Some fun with partitions and InnoDB ...
Creating or dropping a partitioned table on InnoDB can become a quite expensive operation, on my laptop i'm seeing the following times for a simple table with 100 or 1000 partitions (using 5.1.58 right now as i'm testing on stock Ubuntu 11.10):
So the time to create a partitioned InnoDB table grows linearly with the number of partitions at a 'speed' of about 20 partitions per second, and during that time the hard disk LED is always on. The rate for creating regular InnoDB tables on this machine is about 10 tables per second by the way. MyISAM on the other hand can create around 17 tables per second, and creating a single partitioned MyISAM table only takes about a 10th of a second.
So what is going on with InnoDB here? First of all, from the engine's point of view each partition of a partitioned table is an actual table. All the partitioning magic happens on a layer above the storage engine one, and the storage engine only receives handler requests for the actual partition tables involved. On the storage engine layer the actual engines are not aware that these tables are part of a larger partitioned table at all (with the exception of ndbcluster tables, but MySQL Cluster is a different story in this respect anyway). So when creating an InnoDB table with 1000 partitions the InnoDB storage engine actually receives 1000 individual requests to create a table. Now what seems to happen is that for each
So this may be something to keep in mind when planning to use large numbers of partitions on InnoDB ...
Thursday, February 23. 2012
MySQL features i forgot about #1 : slave_compressed_protocol
It is probably about time to re-read the MySQL manual end-to-end as i more and more find myself discovering features i either completely forgot about or which i never was aware of in the first place ...
Today's guest is slave_compressed_protocol, an option that has probably been there ever since MySQL 3.23 at least (so i can't claim to have missed the ChangeLog entry as i usually do).
With slave_compressed_protocol enabled the communication between slave and master uses the MYSQL_OPT_COMPRESS option to compress the protocol stream if both sides support it (and it's very unlikely to find an installation that does *not* support it these days), so this can be a big help if your master and slave are at different sites with only limited bandwidth between the two.
Unfortunately this is a global server option though. IMHO this should be part of the options provided by the CHANGE MASTER command, similar to all the SSL encryption related stuff that is part of CHANGE MASTER. Looks as if it is about time for YAFR (Yet Another Feature Request) ...
Monday, October 10. 2011
PHPreboot braindump
Looks as if we have the next member in the "I want to become Caliph instead of the Caliph" club: PHPreboot.
This is just a braindump of thoughts on the various bullet points and examples on the project's home page (but i don't think it's worth any more time to analyze it and comment on it either):
Not sure whether this is really an improvement. While $ and ; are not really necessary from a parser's (or lazy typer's) point of view they do carry some context information ... so this is breaking the "The burden shall be on the writer, not the readers" principle IMHO
Yes, that's all that needs to be disabled to guarantee secure code ... NOT
Magic quotes were a bad idea, but for slightly different reasons. eval() by itself is not a bad idea either if used properly, same as with backticks, system(), popen(), ... And just by disabling magic quotes and eval() you do not make code secure by default. XSS and SQL injections can still happen without these, and i'd bet that most malicious code injections in vulnerable PHP apps were not using eval() but include/require as the attack vector to make PHP execute their own code ... want to forbid these, too?
That's a good thing, we may just have different ideas of "full" ...
I do like the "to" part of it ... not really sure about the "from" part ...
Which SQL dialect? And looking at the code: how would i handle queries against multiple database connections?
I personally prefer to write preg_match() literally ... but that may be just me and my PERL allergy ...
I don't see that much gain in that one over explicit fopen(), ... calls either
There's room for improvement in PHP optimizers and op code caches for sure ... In my personal use cases a JVM based PHP would actually be a loss though. Most stuff i'm using PHP for these days is medium complex command line stuff (like my code generators), and for most of this my PHP scripts are already done with their job in a time that would not even be sufficient for a JVM to initialize and execute the very first byte code op of my user code. There's also the more general issue that when going the way of the JVM you either go bytecode all the way and have to say goodbye to all the PHP extensions that are actually just thin wrappers around native C libraries, re-implement those libraries' functionality in a language supported by the JVM, or go the way of JNI ...
... instead of SQLite and ?forgot-the-name-of-the-upcoming-integrated-webserver-thingy? ...?
Sorry, can't see how that's "native XML syntax", PHP and its use of XML processing instruction syntax is as native as you can get, hiding executable code in DATA not so much
Sorry, but even that simple example does not really look like PHP to me at all. Maybe part of that is due to the use of "elseif" instead of "else if". I haven't seen any PHP code using "elseif" in years, i even had to double check that it works in this context and not just in the IF: ... ELSEIF: ... ENDIF; form.
Again i'm not sure whether i want to have that on the Syntax level ...
So to summarize:
=> so why should it have the letters PHP in its name at all?
Tuesday, October 4. 2011
Overriding bind=... in included MySQL option files
Debian and Ubuntu make use of the !includedir directive at the end of the packaged my.cnf to allow for extending and overriding their default configuration without having to modify the distribution config file. This is especially nice in combination with configuration tools like puppet, as it is usually much easier to add a file to the system than to modify an existing file.
I ran into a small gotcha with this though: the default my.cnf binds mysqld to the localhost TCP interface so that it listens on IP 127.0.0.1 only. To make the mysql server reachable from other hosts the bind=127.0.0.1 setting needs to be overridden. The most obvious approach would be to simply say bind= , but this does not have any effect. What needs to be done instead is bind=0.0.0.0 to make mysqld listen on all interfaces again. This is obviously a solution for IPv4 only; to allow incoming connections on all v4 and v6 interfaces, letting bind= without any argument reset any existing bindings would probably be more clever?
Saturday, September 17. 2011
Multiple network adapters in Vagrant VMs
For testing some stuff related to MySQL Cluster, port binding and having two network interfaces on the same subnet i needed a bit more than my usual "all nodes on localhost" setup. Looking for a reusable solution i this time did not simply plug several of my old laptops into the same hub but tried to create a Vagrant setup for this. This quickly led to the question of how to define multiple network adapters per Vagrant VM. The manual is not too clear about this yet, and the key piece of information is not to be found in the network section but is hiding in the section on port forwarding: the :adapter option. So to define multiple adapters you need to add :adapter => $value options on each of the vm.network lines, with $value counting up from one, e.g.:
config.vm.network("33.33.33.10", :adapter => 1);
In my case i need both adapters to be on the same IP subnet. Vagrant (or VirtualBox?) is not clever enough for this yet and will create two distinct host-only networks for this; routing and binding work as expected though, as the kernels inside the VMs do not care about (and are not even aware of) this.
Friday, April 15. 2011
MySQL Conference talk slides
Thursday, March 3. 2011
Why go with O(n) if you can have O(n²)?
Found this while waiting for the CSV export of a search result from (*undisclosed project*) ... The export took almost half an hour even though it was just extracting the snail mail addresses of about 40K contacts, so what caused it to only spit out about 20 rows per second?
Looking at the PROCESSLIST output a pattern was easy to detect. What greeted me was a single running query which had only been running for a few seconds even though i had already been waiting for export results for about 20 minutes: SELECT ... LIMIT 32900, 100; And then a few seconds later: SELECT ... LIMIT 33000, 100;
So instead of retrieving the search results for export by firing a single query and processing the results in one go, the same pagination logic seemed to be used as for displaying search results. About 400 queries instead of a single one, and each iteration taking more query time, with the last one finally taking about as long as the single query of the all-at-once approach would have taken on its own. As every LIMIT query has to produce and throw away all the rows before its offset, the total work grows with the square of the result size. Somewhere, somehow, something went terribly wrong it seems ....
Monday, November 8. 2010
osm2pgsql now able to read PBF files directly
With the changes i recently committed to the OSM SVN repository osm2pgsql can now read OSM files in the new binary PBF format directly (with parsing code based on that found in pbf2osm). Files in PBF format are available from the geofabrik download site along with their XML counterparts and are usually about 30% smaller than bzip2 compressed XML. osm2pgsql also parses PBF about twice as fast, substantially reducing the time taken for the first processing step.
Also part of my committed changes is an improved autotools setup that checks for library and header file availability in the configure stage already, and that only includes PBF support if the needed Protocol Buffers related tools and libraries are available. Unfortunately the PBF code requires at least protobuf-c 0.14, which is not yet part of current distributions, so you'll have to install that one from source yourself. But with proper configure checks in place you'll at least be warned about version mismatches here, and osm2pgsql will be built without PBF support then.
Tuesday, October 12. 2010
DirectDownload plugin now with track description
Thanks to TomH for adding the missing fields to the API response so quickly
Monday, October 11. 2010
My first JOSM plugin: Direct GPX track download
No more need to manually download tracks you previously uploaded to openstreetmap.org first and then open the downloaded file with JOSM's "Open file ..." dialog. You can now pick an uploaded track and have it loaded into a new GPX layer directly with the help of the DirectDownload plugin. You can download the compiled plugin .jar or the source code, but be warned: current status is "works for me" aka. "proof of concept".
Continue reading "My first JOSM plugin: Direct GPX track download"
