clawhaven.com

  • Home
  • Resume - CV
  • Examples
  • Contact
 

Lame-A$$ MySQL DBA Technical Interviews...
or how to show candidates that you haven't done your homework! 

By Tim
July 22nd, 2009

You can almost always determine a feckless MySQL interview when it includes any of the questions from this list.  When I hear these questions come out, I immediately inform the interviewer(s) that there is a popular online list of “29 MySQL interview questions” with questions and answers which are poorly selected, worded and in almost every case where the answers are misleading, useless or (at worst) entirely wrong. The questions and answers on that list are astonishingly popular, mostly wrong and studied by a lot of interviewers and prospective candidates as "gospel."  That makes the list dangerous, and it makes this single blog entry a necessity.  If the questions being asked are bad, and their respective, proposed answers share an equal validity, then at least having the facts available would be helpful beforehand.  To this end, I suggest you print off a copy of this entry and bring it to the interview, or refer the interviewers to clawhaven.com.

The only valid excuse to use any of these questions is to see if the candidate studied from the list and uses the BAD answer(s).  Otherwise, the presence of these questions, en-masse, should result in a unabashed public mocking.

Copies of the bad questions and their lame answers are available at :29 MySQL interview questions, MySQL and SQL, MySQL Interview Questions, MySQL Questions and Answers, and on some blog pages like MySQL Interview Questions and Answers.  Those are only a few where these questions and answers are found.

Technical interviews are not that difficult.  They are reasonably simple to construct and administer, and they are even easier to take.  That is only the case if the interviewer put actual preliminary effort into the process as a forethought.  Too many interviewers, in my recent experience, are astonishingly lazy.  That guarantees everyone a dreadful experience followed.

My suggestion is to analyse those aspects of technical expertise actually required by your organization, and construct the test around those.  If you don’t use, nor plan on using replication then questions regarding multi-master replication are beyond pointless; just as if you are a Windows only shop, then questions regarding clustering share an equivalent inanity.  Ask your candidate how they create users, assign or determine permissions, monitor the database, what they watch for, how they determine and enact optimizations and how they handle back-ups, replication, clustering (only if needed - not everyone does or wants clustering or replication).

Simply asking how a prospective DBA might implement your current architecture is usually far more indicative of their proficiency than tossing out the odd-lot of MySQL minutia.  

Here is the list, and my scathing analysis of the answers.  The original questions from the list are presented along with the answer claimed as correct for that question.  Following each is my answer, aka the CORRECT answer.  The list is arranged in rough order according to how wrong each answer was, or how stupid the very question itself was.

(Q) What does myisamchk do?
(A) It compressed the MyISAM tables, which reduces their disk usage.

Their answer is beyond utterly wrong; and that is wholly discouraging because knowing what myisamchk really does is critical to being an actual MySQL DBA!  There is a utility called myisampack which will compress MyISAM tables -- so perhaps the original writer of this question meant that instead.  But the two utilities are entirely different tools.  It’s much like asking “what does a forge do” and then expecting them to describe the function of a pistol.

File that question and it’s answer under holy-craptastic!

The correct answer is : myisamchk is a diagnostic and repair utility provided with MySQL.  It only works with MyISAM tables and only correctly supports wild-cards in Linux (or at least never handles them correctly in Windows).  Here is the official MySQL link for additional information : myisamchk

The utility can check MyISAM tables and their respective indexes for various kinds of problems and inconsistencies.  Given the appropriate flags, the utility will actually repair the problems, when possible.

What myisamchk does NOT do is compress MyISAM tables!  If there is any change to the disk-space used from the before and after the execution of myisamchk, it’s because there was some problem with that table!

(Q) If I created a column with data type VARCHAR(3), what would I expect to see in MySQL table?
(A) CHAR(3), since MySQL automatically adjusted the data type.

That is such a popularly asked question.  It’s probably popular because it’s simple and would express something fundamental about MySQL.  The problem is that the question is vague to the point of uselessness.  Worse still, the answer is entirely wrong and always has been. 

Simply put, the answer provided, that a VARCHAR(3) becomes a CHAR(3) is unambiguously false, that never happens and it never has happened!  If it did happen, it would abjectly jeopardize the efforts of DBAs.

But, pin-headed interviewers will seek to use your answer to pigeon-hole your MySQL experience-level.  No interviewer is likely to actually question that answer regarding an implicit internal change because that would require effort.  But it should be the responsibility of any DBA (skeptics by nature) to test that, and there is no test which supports the claim of the datatype change.

The truly correct answer is : You will see VARCHAR(3) in the table.   It’s that simple!

The correct response during the intervew, however would be to press for details, like : What type of table is this, ie,. What ENGINE was used; was it MyISAM, InnoDB, NDB, etc…?  Then ask what version of MySQLD they are running, was it custom-compiled under GCC or was it a MySQL AB supported binary release?

The interviewees are expected to instantly regurgitate the “MySQL will convert/change/alter or otherwise perform some magical transformation of VARCHAR(3) into CHAR(3)” and will be ready to label you a clueless poser should you not “toe the line”.

As a simple test, try this exercise :

DROP TABLE IF EXISTS two_columns;
Query OK, 0 rows affected (0.06 sec)

CREATE TABLE IF NOT EXISTS two_columns ( first_column char( 3 ), second_column varchar( 3 )) ENGINE=MyISAM;
Query OK, 0 rows affected (0.06 sec)

SHOW CREATE TABLE two_columns;

CREATE TABLE `two_columns` (
  `first_column` char(3) default NULL,
  `second_column` varchar(3) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8

So, that definitely worked to create a MyISAM table with whatever the default character-set is (UTF8 in that case).  You can tell it’s not simply spitting back our original creation script because it’s safe-quoted the table and column-names, it has added the DEFAULT NULL and the DEFAULT CHARSET specifier.

Most critically, notice that the second_column is still VARCHAR( 3 ).

Now, you can change the character-set to Latin1, or any other character-set and run the script again, and in each and every case, the VARCHAR( 3 ) never becomes a CHAR( 3 ).  Furthermore, you can change the ENGINE (table-type) to any other non-Blackhole engine and there is still no change to the column-type of second_column.

Finally, as a test, I had inserted a group of 1500ish  rows, of varying lengths from 1 to 3 characters for each column.  I noted the length of the

Here was the actual insert :

INSERT INTO two_columns ( first_column, second_column ) values ( 'a', 'a' );

Counting bytes by hand, would show 4 bytes per inserted row if it’s real VARCHAR and a CHAR(3), or 6 bytes if they are both CHAR(3).  Ok, since the row-sizes are so small, the overhead will clobber us.  But, there’s a really easy way to find out. 

The SHOW TABLE STATUS for table two_columns indicates a Data_Length of 30,000 bytes for 1500 rows where one column is VARCHAR(3) and the other column is CHAR(3).  If we recreate that table as CHAR(3) for both rows and run the same inserts, the Data_Length is 28,500.  That’s precisely 1500 bytes!   You might think, that is backwards, heck, the varchar makes it longer!  Of course it does, for internal storage reasons.  But that is actually proof that the CHAR(3) isn’t being done, and especially not for spatial-efficiency concerns!

In summation, MySQL doesn’t change the internal create, it doesn’t store column’s definition as CHAR(3) in any detectable place in any discernable manner.  The length of the data clearly indicates that the CHAR(3) is definitely not being used.

So the correct answer is unequivocally, VARCHAR(3) becomes (or stays) VARCHAR(3); it’s not CHAR(3).  MySQL does NOT automatically adjust the type.

Then, feel compelled to mention that only an utter bone-head would pull that question off the web.

(Q) If you specify the data type as DECIMAL (5,2), what’s the range of values that can go in this table?
(A) 999.99 to -99.99. Note that with the negative number the minus sign is considered one of the digits.

I actually hear this one a lot.  Their answer even sounds plausible.  There is a standardhuman practice of indicating number ranges, which is to start from smallest-first and progressing to greatest-last.  Thus it would be -99.99 to 999.99 for the WRONG answer, but at least, it would be consistently specified, as per standard human convention.

The correct answer is -999.99 to 999.99.  I've worked it out below, using examples from real-world testing.  But you could go to MySQL for the real gospel, scroll down (or search for the word "salary") at Numeric Data-Types.they cover precisely that example!

But, let’s test it :

Back to our two_coumn table :

DROP TABLE IF EXISTS two_columns;

CREATE TABLE IF NOT EXISTS two_columns (  `first_column` char(3) default NULL,  `second_column` DECIMAL (5,2) default NULL) ENGINE=MyISAM;

INSERT INTO two_columns ( first_column, second_column ) values ( 'v0', -1001.00 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v1', -999.999 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v2', -999.995 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v3', -999.9949 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v4', 105.802 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v5', 999.02 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v6', 999.994 );
INSERT INTO two_columns ( first_column, second_column ) values ( 'v7', 999.995 );

select * from two_columns;

And we get :

+--------------+---------------+
| first_column | second_column |
+--------------+---------------+
| v3
           | -999.99       |
| v4
           | 105.80        |
| v5
           | 999.02        |
| v6
           | 999.99        |
+--------------+---------------+

As expected, V0 was way too big, and V1 and V2 are a wee bit out of the range.  Notice that V3 and V6 do manage to squeeze in, but they are internally truncated and those digits beyond the third are irretrievably lost, but no warning is issued nor any error raised.  Anything above V7 is clearly verboten, as it raises an error (1264) and is not stored.

(Q) What happens if a table has one column defined as TIMESTAMP?
(A) That field gets the current timestamp whenever the row gets altered.

Generally, that’s a decent explanation.  But the handling of TIMESTAMP has changed as of 4.1.

There is the UTC concern, which means that if you look at the value in that row and then another person looks at the same value in the same row, they may very well NOT see the same data and that behavior is by design!

More significantly, the handling of updates and inserts is complicated.  You have to look at the lengthy MySQL description of the current TIMESTAMP logic online.

The reason I put a mostly correct answer all the way up here is because it’s follow-up is so incredibly beyond silly.

(Q) But what if you really want to store the timestamp data, such as the publication date of the article?
(A) Create two columns of type TIMESTAMP and use the second one for your real data.

What?!  That’s wrong!  Furthermore, it’s stupid too.  Even MySQL documentation says it's wrong at datetime.

You can store actual data in the timestamp, whether it’s the first or even only TIMESTAMP!  But honestly, just make a DATETIME field, and avoid all the headache!

(Q) Explain advantages of MyISAM over InnoDB?
(A) Much more conservative approach to disk space management - each MyISAM table is stored in a separate file, which could be compressed then with myisamchk if needed. With InnoDB the tables are stored in tablespace, and not much further optimization is possible. All data except for TEXT and BLOB can occupy 8,000 bytes at most. No full text indexing is available for InnoDB. TRhe COUNT(*)s execute slower than in MyISAM due to tablespace complexity.

Firstly, they never indicate the primary reason to go with MyISM.  It’s faster!  It’s the choice for read-mostly/write-rarely tables.

The only portion for which I’d give even the iota of credit would be for is the FULL-TEXT index criterion.

Holy hemorrhoids – there they go again with compressing via myisamchk! 

The 8,000 bytes thing is misleading in only every possible way!  By misleading, of course, I mean it’s totally wrong.  The author probably meant that the entire row can only be a maximum of 8k bytes.  That’s wholly wrong, because each row can be as much as 65,536 bytes in length (or substantially more if TEXT or other blob-like field-types are used). Now, on the other hand, NDB tables are limited to 8k in per-row length (but only the NDB engine, ie., clustering uses that, which is completely NOT Innodb).

Actually, the answer-statement, as originally written, means that MyISAM cannot have a table of more than 8 kilobytes which is utterly preposterous (ie., wrong).

The COUNT(*) is only usually slower and that’s only if the InnoDB table doesn’t have a primary-key.  And the reason isn’t table-complexity, but it’s based on how the SELECT is analysed and evaluated.  From my understanding, the row-count for a MyISAM table is maintained in the FRM.  However, all accessed KEY/INDEX fields are loaded into memory as used, so that FRM or primary-key from an InnoDB should be near instantaneously pulled.

(Q) What happens when the column is set to AUTO INCREMENT and you reach the maximum value for that table?
(A)  It stops incrementing. It does not overflow to 0 to prevent data losses, but further inserts are going to produce an error, since the key has been used already.

Darned, that is so close to correct, it’s almost sad to eviscerate this one.

What he should have said was that when the auto-increment value exceeds the maximum value supported for that specific column’s data-type, then, and only then, will errors be generated on each successive insert.  He is absolutely correct, it doesn’t wrap to zero.

But he is wrong in that the errors are coming from an already used key.  His example said only that a column is auto-increment, not that the column is the primary key, although AUTO_INC’s have to be either the key or they must be at least a part thereof. 

The actual error you get is : ERROR 1264: Out of range value adjusted for column 'id' which isn’t a key error, but a numeric data range type error.

(Q) What are HEAP tables in MySQL?
(A) HEAP tables are in-memory. They are usually used for high-speed temporary storage. No TEXT or BLOB fields are allowed within HEAP tables. You can only use the comparison operators = and <=>. HEAP tables do not support AUTO_INCREMENT. Indexes must be NOT NULL.

Of the answers provided by that author, this is the one that is closest to being correct.  But it still fails and a few significant points.  If he stopped before providing the limitations then he’d have been quite spot-on correct!

Text and Blob fields are still verboten.  And HEAP or MEMORY (synonyms) are fast, and never written to disk so they are temporary.

However, some things have changed and those are genuinely critical changes which interviewers are typically not aware of.  Heap/Memory table Indexes can contain NULLs!  AUTO_INCREMENT is supported.  Also, comparison operators are not limited to equals/greater-than/less-than or GTE/LTE in that  LIKE is supported, IS NOT NULL is supported and so on.

Sure, you can use the NET START servicename, but the service-name is not always MYSQL.  Nowadays, it’s more likely to be MySQL41, MYSQL5 or something else.  And, since it’s Windows, why are you even using command-line methods.  In Windows, go to the Administrative Tools, and under SERVICES, check the MySQL service(s) and start or stop them that way.

(Q) How do you start MySQL on Linux?
(A) /etc/init.d/mysql start

That is definitely correct, but there’s also service mysqld start and at least another couple of ways.

(Q) What does tee command do in MySQL?
(A) tee followed by a filename turns on MySQL logging to a specified file. It can be stopped by command notee.

That should be totally reworded as : What does the TEE command do in the MySQL Command-Line client application mysql? Otherwise, the normal person would wonder what TEE does to MySQL the server?! 

It’s hyper-minutia, in that it applies only to a minor feature of a command-line client which features other methods, such as redirection, for supporting the same functionality.

(Q) What are CSV tables?
(A) Those are the special tables, data for which is saved into comma-separated values files. They cannot be indexed.

That is a little too little on the information-giving side.

CSV is a storage engine capable of reading and writing to comma-separated values files as though it were a standard table-type.  It’s true that you cannot index it.  Also, it’s important that it was not supported in Windows until 5.1 (in other words, until practically last week). 

(Q) What are some good ideas regarding user security in MySQL?
(A) There is no user without a password. There is no user without a user name. There is no user whose Host column contains % (which here indicates that the user can log in from anywhere in the network or the Internet). There are as few users as possible (in the ideal case only root) who have unrestricted access.

Incredibly that’s mostly correct.  Of course, “good ideas” is too vague.  Other “good ideas” would be to have passwords change each week, and have passwords and user-names each being at least 10 characters each, with upper and lower-case and digits.

The host column containing the percent-sign caveat is poorly worded, unrealistic in many circumstances but more critically, it lacks any indication of what should be placed there instead. 

(Q) Explain federated tables.
(A) Introduced in MySQL 5.0, federated tables allow access to the tables located on other databases on other servers.

Gee, that’s not too vague. I’d never take that as an answer and neither should an interviewer.

More details should be specified like… the federated table is really an effective LINK to a remote table.  It functions much like a remote synonym or view in that the data never “exists” on the local server.  It should be mentioned that the credentials are stored / specified in the CREATE of the FEDERATED table.  In newer versions of MySQL, it could be mentioned that one can point to multiple tables or even multiple servers.

It should also be mentioned that FEDERATED tables are almost always a bad idea if joining or complex query access is to be handled.

(Q) What is SERIAL data type in MySQL?
(A) BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT

Well, in the years I’ve been helping with MySQL databases, I’ve actually never seen anyone use this data-type, it comes up in a lot of interviews.  It’s used in some academic and test databases, but I’ve never seen it in the field.

It’s basically a big primary-key field as a legacy from PostGreSQL.  Fully qualified, it’s an internal synonym, from 4.1 onwards, for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE.

Notice, I didn’t say PRIMARY KEY in that alias, because MySQL doesn’t say primary-key.

(Q) Explain the difference between BOOL, TINYINT and BIT.
(A) Prior to MySQL 5.0.3: those are all synonyms. After MySQL 5.0.3: BIT data type can store 8 bytes of data and should be used for binary data.

That’s a trick question.  It’s also minutia, it means nothing to pretty much every real DBA. 

A BOOLEAN or BOOL and a TINYINT are single bytes.    Except for BIT, which was only recently introduced and is of very limited practical use.

(Q) Explain the difference between FLOAT, DOUBLE and REAL.
(A) FLOATs store floating point numbers with 8 place accuracy and take up 4 bytes. DOUBLEs store floating point numbers with 16 place accuracy and take up 8 bytes. REAL is a synonym of FLOAT for now.

This one is important, and I’ll let it slide even though it’s general.  At least it is concise.

(Q) Explain data type TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
(A) The column exhibits the same behavior as a single timestamp column in a table with no other timestamp columns.

That’s mostly correct, I wouldn’t correct it except to indicate that its’ brevity prevents it from being of any use.

(Q) What does TIMESTAMP ON UPDATE CURRENT_TIMESTAMP data type do?
(A) On initialization places a zero in that column, on future updates puts the current value of the timestamp in.

This is actually correct.

(Q) Explain TIMESTAMP DEFAULT ‘2006:09:02 17:38:44' ON UPDATE CURRENT_TIMESTAMP.
(A) A default value is used on initialization, a current timestamp is inserted on update of the row.

This, too, is right.  That's about three so far in the entire list.

(Q) How do you start and stop MySQL on Windows?
(A) net start MySQL, net stop MySQL

That’s not entirely wrong.  It is just not entirely correct.

You start and stop mysql in WINDOWS by executing the MYSQLD.exe or Mysqld-nt or the mysqld-max (etc…) applications. 

Furthermore, the service-name is used, and that is rarely MySQL anymore.  It's often MySQL51, MySQL4 or anything the user selects.  So the net start or net stop portions are correct but the detail about the service name are not really optional.  Also, this is Windows only, so if your facility doesn't have or support MySQL installed under Windows, then why even ask this?

(Q) What’s the default port for MySQL Server?
(A) 3306

Well, honestly, every MySQL professional should know that one absolutely cold.  But more details on where it’s used, how to change it, or even the fact that it can be changed would be nice.

(Q) Explain the difference between MyISAM Static and MyISAM Dynamic.
(A) In MyISAM static all the fields have fixed width. The Dynamic MyISAM table would include fields such as TEXT, BLOB, etc. to accommodate the data types with various lengths. MyISAM Static would be easier to restore in case of corruption, since even though you might lose some data, you know exactly where to look for the beginning of the next record.

Fundamentally, this is true.  Technically, this is useless.

 

Busy week coming up...

By Tim
July 13th, 2009

The MySQL Best Practices document is coming along pretty quickly.  Like many projects, once you've significantly completed any section, one or more unaddressed points arise, like the heads of a hydra.

I've been installing a "team" or cluster of Centos and Ubuntu servers here at home for testing.and educational purposes.  Does anyone else agree that Ubuntu just has a much happier "feel" from an utterly ineffable vantage?  Hours ago, I felt as though I was living the Robert-Frost quote from Stopping By Woods on a Snowy Evening, with the line "and miles to go before I sleep, and..." but now I'm way beyond that and the night has clearly taken on a more Lovecraftian ambiance -- and Poe's Raven feels way more apropos (or apres-Poe).

Now, I'm really good at Concrete5.  And I'm practicing Drupal too, and have only two more installs of Drupal to finish by tomorrow.

Well, 3:00 AM draws nigh, and I really want to be in bed before π-o-clock (fourteen minutes after 3).  Did ya know that pi is Greek for.. umm.. pi, actually.  It's one of their letters, but it's used to represent the ration of the perimeter to the diameter, as it's a quasi-abbreviation of the Greek word περίμετρος (which I think is (phonetically) perimetros because almost everything in Greek ends in an S if it's not a vowel, but that's clearly the word perimeter).

Wicked Busy Again

By Tim
July 9th, 2009

Of course, I'm looking for a job, so that's my first priority.  And I've been busy interviewing, tech-interviewing, emailing, phoning, texting, IM-ing and all that stuff

In my very off hours I am busy writing stories for my son (T5).  The stories are primarily about two characters and it's mainly based in a small area (a town of twelve other characters).  For now, it's called The Adventures of Malcom & Bink.  The chief selling point is that they do all their own stunts!

Otherwise, I'm managing my Concrete5 (a brilliant CMS) site, and hacking Drupal and Joomla, with CSS tweaks and PHP.  

On the heavy-side, I'm doing lots of playing around in Linux for a zillion reasons, primarily the Ubuntu and Centos/Red Hat Enterprise editions.

One of my newest projects is a MySQL DBA Best Practices Wiki.  That's going to take a bit of work and time, but I think it's for the best.

 

New Domain and Site

By Tim
July 7th, 2009

Well, thaumatics.com is gone and that's ok because that was a hard domain to explain to people and too annoying to have to continually spell-out.  So, after some patient and diligent exploration, I discovered that clawhaven was free.  And it's easy, claw is a commonly used and simple word, and haven is kind of simple.  Of course, I worry about people getting it transposed and thinking havenclaw (rhyming with a Harry Potter allusion).

Regardless (not irregardless which is not even a proper word, as no proper words come from Indiana) I got the domain and, for now, I really like it.

I admit that I feel a bit bad about losing thaumatics because I know many people will probably look there for me.  And I feel bad because I just gave some nice, youngish girl, named Molly, that email address only days before I lost the domain.  So, if anyone knows a Molly that works at CBS who checks or veifiies claims in commercials and works in Manhattan, then let her know my new address is :

tim@clawhaven.com This e-mail address is being protected from spambots. You need JavaScript enabled to view it

This new domain is hosted on a Linux server (which I'm very ok with and prefer to Windows servers),  Originally, I had set this site up in Joomla.  And that took no time at all.  Oh wait, ok, it was simple, but the template that I really liked, called jf_classico was clearly made for Windows.  That was evident because when I installed it, there were a lot of complaints regarding files not existing, because of mixed-case filenames.  So, I made them all lower-case, re-installed it and it worked.

Then my friend Ihab told me about Concrete5 and I dived into the deep-end of that.  From Joomla to Concrete5 (which Ihab actually installed, to give credit where due), and I'll keep that as the CMS for this site (I think).

Lost My Job

By Tim
July 1st, 2009

 

While it's a new event, it's not really a shock, nor news-worthy.  I work(ed) in the publishing field, which is absolutely unstable, and with the effects of the current global, national and local economic conditions, make job losses in this sector more the rule than the exception.  And everyone in the department had long had suspicions, so, again, it was no complete shock.  Almost everyone I'd worked with the entire last decade is now out of work, it's astounding, kind of.

So, I'm on facebook, linkedin, monster, dice, theladders, hotjobs and others.

I'm getting lots of calls and emails and responses across the board, so that's wicked encouraging.

 
Built with concrete5 CMS. © 2010 clawhaven.com.    All rights reserved. Sign In to Edit this Site