Posts Tagged ‘innodb’

problem when creating a secondary index on a large innodb table

March 18th, 2011

The table has only 2 integer columns (id, friendId), with a composite primary key on (id, friendId). Now I want to look users up in reverse, by friendId. Without an index on friendId each search scans the whole table and takes about 90 seconds, so I decided to create an index on friendId.
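
For reference, the reverse lookup is basically a query like this (the friendId value is just a placeholder; friendships is the table from the LOAD DATA statement below):

-- without an index on friendId this does a full table scan
SELECT id FROM friendships WHERE friendId = 12345;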

First, I tried CREATE INDEX index_name ON tb_name (column_name). After 30 hours it still hadn't finished and I had no idea when it would, so I interrupted it.

Then I truncated the whole table, altered it to add the new index, and inserted all the data again with LOAD DATA INFILE …. INTO TABLE tb_name. During the whole procedure only about 5% of the CPU was used and io_wait was high. In the end it took 55 hours:

mysql> LOAD DATA INFILE '/home/weiping/rr.dump/rr.friends.all.20100310.csv' INTO TABLE friendships;
Query OK, 137138510 rows affected (2 days 7 hours 46 min 5.46 sec)
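
Spelled out, the whole procedure was roughly the following (the index name is only my placeholder):

TRUNCATE TABLE friendships;
-- the index name here is a placeholder
ALTER TABLE friendships ADD INDEX friendId (friendId);
LOAD DATA INFILE '/home/weiping/rr.dump/rr.friends.all.20100310.csv' INTO TABLE friendships;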

My my.cnf is like this:

delay_key_write=ALL
bulk_insert_buffer_size=2048M
key_buffer              = 768M
max_allowed_packet      = 16M
thread_stack            = 192K
thread_cache_size       = 144
query_cache_limit       = 1M
query_cache_size        = 32M
innodb_buffer_pool_size = 1024M
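
To double-check the values the server actually runs with (the settings above, as exposed by the server; note that key_buffer shows up as key_buffer_size):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'bulk_insert_buffer_size';
SHOW VARIABLES LIKE 'key_buffer_size';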

Of course, I have switched off these unnecessary checks:

SET unique_checks=0;
SET sql_log_bin=0;
SET foreign_key_checks=0;

When I loaded the data before, without the secondary index, it took only 30 minutes! Now I'm wondering why it takes so much longer with just one more index…

BTW, will increasing sort_buffer_size help a massive insertion like this (with a secondary index in place)?
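
If I do try it, raising it only for the loading session would look something like this (the 256 MB value is just a guess):

SET SESSION sort_buffer_size = 268435456;  -- 256 MB, value is only a guess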

notes for migrating large data in mysql (innodb)

March 13th, 2011

Here migrating means exporting & importing. The tips below may help you when you need to migrate a large amount of data. I'll also list the resource consumption of my export & import of 2 big tables:

name: users
rows: 21,000,000
definition:
CREATE TABLE `users` (
  `id` int(11) NOT NULL,
  `name` char(30) CHARACTER SET utf8 NOT NULL,
  `city` char(50) CHARACTER SET utf8 DEFAULT NULL,
  `insertTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `assignedTo` smallint(5) unsigned NOT NULL DEFAULT '0',
  `iconUrl` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `assignedTo` (`assignedTo`),
  KEY `insertTime` (`insertTime`),
  KEY `users_crawlTime` (`crawlTime`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

name: relations
rows: 131,000,000
definition:
CREATE TABLE `friendships` (
  `id` int(11) NOT NULL,
  `friendId` int(11) NOT NULL,
  PRIMARY KEY (`id`,`friendId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

There are 2 ways to export & import data:

  • Using SELECT … INTO OUTFILE … to export, and LOAD DATA INFILE … INTO TABLE … to import.
  • Using mysqldump … to export, and source … to import.

1. SELECT … INTO OUTFILE … & LOAD DATA INFILE … INTO TABLE …

The basic usage of SELECT INTO is:

SELECT … FROM tb_name INTO OUTFILE '/home/me/datadump/xxxx.txt';

The basic usage of LOAD DATA is:

LOAD DATA INFILE '/home/me/datadump/xxxx.txt' INTO TABLE tb_name [FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'];

1.1 Tips for SELECT INTO

  • use ORDER BY primary_key.
  • increase sort_buffer_size (see the sketch after this list).
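
Putting the two tips together, an export of the friendships table might look roughly like this (the output path and the buffer size are just examples):

SET SESSION sort_buffer_size = 268435456;  -- 256 MB, example value
SELECT * FROM friendships ORDER BY id, friendId INTO OUTFILE '/home/me/datadump/friendships.txt';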

1.2 Tips for LOAD DATA

  • turn off foreign key checking, log bin & unique checking.
    set foreign_key_checks=0;
    set sql_log_bin=0;
    set unique_checks=0;
    load data ...
  • turn on delay_key_write and increase bulk_insert_buffer_size by editing the [mysqld] section in my.cnf (a complete session combining these tips is sketched after this list). Here is an example:
    delay_key_write=ALL
    bulk_insert_buffer_size=2048M
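
A complete import session combining the SQL-level tips might look roughly like this (the file path is just an example; the my.cnf settings above have to be in place before the server starts):

SET foreign_key_checks=0;
SET sql_log_bin=0;
SET unique_checks=0;
LOAD DATA INFILE '/home/me/datadump/friendships.txt' INTO TABLE friendships FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
SET unique_checks=1;
SET sql_log_bin=1;
SET foreign_key_checks=1;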

2. mysqldump … & source …

The basic syntax of mysqldump is:

shell> mysqldump [options] db_name [tbl_name ...]
shell> mysqldump [options] --databases db_name ...
shell> mysqldump [options] --all-databases

And source is easy to use:

mysql> source /home/me/datadump/xxxx.sql

2.1 Tips for mysqldump

  • use the --opt switch. This option is short for "--add-drop-table --add-locks --create-options --disable-keys --extended-insert --lock-tables --quick --set-charset". The --opt option is enabled by default; you can use --skip-opt to disable it.
  • use the --order-by-primary switch. It dumps each table sorted by its primary key, or by its first unique index if such an index exists. This is useful when dumping a MyISAM table to be loaded into InnoDB.

3. other tips:

  • use a single commit: turn off autocommit, load all the data, and commit once at the end (a sketch follows this list).
  • split a big file into several smaller files, e.g. 1M records per file.
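
For the single-commit tip, a minimal sketch (the part file names are made up; autocommit is assumed to be on by default):

SET autocommit=0;
LOAD DATA INFILE '/home/me/datadump/friendships.part1.txt' INTO TABLE friendships;
LOAD DATA INFILE '/home/me/datadump/friendships.part2.txt' INTO TABLE friendships;
COMMIT;
SET autocommit=1;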