2021 - 4

mysql limit optimization

Test data, mars_tianchi_user_actions (id, primary key index added)

Question: How to obtain numbers 10000001 to 10000005?

Use limit directly

mysql> select * from mars_tianchi_user_actions limit 10000000,5;
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
| id | user_id | song_id | gmt_create | action_type | ds |
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
| 10000001 | c4e89a01990bf87d745aa3b0cc6bd7aa | e5b475da38985734b6848130a31545a8 |
| 10000002 | f54c30493238f3f4dd79a4624fa20601 | f35da5a68294505c0e2f5dde6f8a2111 |
| 10000003 | f54c30493238f3f4dd79a4624fa20601 | ef776ef07e703d58c0af76c1031c0483 |
| 10000004 | d09af4bcf642b21fd13e11eb672e87f5 | 18b2cd447b72712a106bca7138187534 |
| 10000005 | d09af4bcf642b21fd13e11eb672e87f5 | 18b2cd447b72712a106bca7138187534 |
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
5 rows in set (5.50 sec)

It took 5.50 seconds to view the explain

mysql> explain select * from mars_tianchi_user_actions limit 100000000,5;
+----+-------------+---------------------------+-- ----------+------+---------------+------+--------- +------+---------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
----+----------+
| 1 | SIMPLE | mars_tianchi_user_actions | NULL | ALL | NULL | NULL | NULL | NULL | 15763020 | 100.00 | NULL |
+----+-------------+---------------------------+-- ----------+------+---------------+------+--------- +------+---------+
1 row in set, 1 warning (0.00 sec)

Oh, it turns out to be a full table scan. How to use the index?

Connection query

mysql> select * from mars_tianchi_user_actions m , (select id from mars_tianchi_user_actions limit 10000000 ,5) t where m.id =t.id;
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
| id | user_id | song_id | gmt_create | action_type | ds | id |
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
| 10000001 | c4e89a01990bf87d745aa3b0cc6bd7aa | e5b475da38985734b6848130a31545a8 |
| 10000002 | f54c30493238f3f4dd79a4624fa20601 | f35da5a68294505c0e2f5dde6f8a2111 |
| 10000003 | f54c30493238f3f4dd79a4624fa20601 | ef776ef07e703d58c0af76c1031c0483 |
10000004 | d09af4bcf642b21fd13e11eb672e87f5 | 18b2cd447b72712a106bca7138187534 | 1427497200
10000005 d09af4bcf642b21fd13e11eb672e87f5 18b2cd447b72712a106bca7138187534
+----------+----------------------------------+--- -------------------------------+----------------+----- --------+----------+
5 rows in set (4.23 sec)
mysql> explain select * from mars_tianchi_user_actions m , (select id from mars_tianchi_user_actions limit 10000000 ,5) t where m.id =t.id;
+----+-------------+---------------------------+-- ----------+--------+---------------+---------+---- -----+----------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------------+-- ----------+--------+---------------+---------+---- -----+----------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 10000005 | 100.00 | NULL |
| 1 | PRIMARY | m | NULL | eq_ref | PRIMARY | PRIMARY | 8 | t.id | 1 | 100.00 | NULL |
| 2 | DERIVED | mars_tianchi_user_actions | NULL | index | NULL | PRIMARY | 8 | NULL | 15763020 | 100.00 | Using index |
+----+-------------+---------------------------+-- ----------+--------+---------------+---------+---- -----+----------+
3 rows in set, 1 warning (0.00 sec)

It took 4.23s, which is indeed much faster (thinking whether the derived table can be further optimized?)

Subquery

mysql> select * from mars_tianchi_user_actions m where m.id in (select id from mars_tianchi_user_actions limit 10000000,5);

C language structure size and minimum wasted space
Because of memory alignment issues, the different locations of each data type will lead to different sizes of structures. So in the end
How to calculate the size of a structure?
(I recently discovered this problem in a computer system class, so I had to pick up the C language again)
Many computer systems place some restrictions on the legal addresses of basic data types, requiring that the address of a certain type of object must be a multiple of a certain value K (usually 2, 4, or 8). This alignment restriction simplifies the design of the hardware that forms the interface between the processor and the memory system.

TypeK
char1
short2
int, float4
long,double,char*8

We can find that this K value is related to the data type (equal to). And this K value affects subsequent memory alignment.
When we create a structure:

typedef struct STRU{
char a;
short b;
double c;
char d;
float e;
char f;
long g;
int h;
}S;

After creating the structure, the situation in memory is:
The first data a is a character type with size 1, K=2, and the offset should be a multiple of 1, then offset=0 (the starting offset is 0);
The second data b short integer size is 2, K=2, and the offset should be a multiple of 2, then offset=2;
The size of the third data c double precision decimal type is 8, K=8, and the offset should be a multiple of 8, then offset=8;
The fourth data d character type size is 1, K=1, and the offset should be a multiple of 1, then offset=16;
The fifth data e single-precision decimal type size is 4, K=4, and the offset should be a multiple of 4, then offset=20;
The sixth data f character type size is 1, K=1, and the offset should be a multiple of 4, then offset=24;
The fifth data g long integer size is 4, K=4, and the offset should be a multiple of 4, then offset=28;
The integer size of the fifth data h is 4, K=4, and the offset should be a multiple of 4, then offset=32;
At this point, 36 space has been occupied in the memory. Since the size of the structure must be a multiple of the maximum member size of the structure members (8 in this structure), the size of the space allocated at this time should be 40 .
We can also use programs to verify:

#include <stdio.h>
int main()
{
typedef struct STRU{
char a;
short b;
double c;
char d;
float e;
char f;
long g;
int h;
}S;
S A = {"a",2,3.1,"d",5.3,"f",777777777777,8};
printf("a in 0x%x\n",&(A.a));
printf("b in 0x%x\n",&(A.b));
printf("c in 0x%x\n",&(A.c));
printf("d in 0x%x\n",&(A.d));
printf("e in 0x%x\n",&(A.e));
printf("f in 0x%x\n",&(A.f));
printf("g in 0x%x\n",&(A.g));
printf("h in 0x%x\n",&(A.h));
printf("Size=%d",sizeof(T));
return 0;
}

Output result:

a in 0x61fef8
b in 0x61fefa
c in 0x61ff00
d in 0x61ff08
e in 0x61ff0c
f in 0x61ff10
g in 0x61ff14
h in 0x61ff18
Size=40

In order to more intuitively display the distribution and alignment of the memory, you can view the following figure (white is wasted space, red stores data):

) can clearly write out the distribution and alignment of a structure according to some rules. Here are some simple rules I have summarized:

  • The offset of the data on the memory is related to K (the offset is a multiple of K).
  • The size of K is related to the data type (data type size = K, usually 2, 4 or 8).
  • The size of the structure must be a multiple of the size of the largest member of the structure.
    According to the figure, we can find that the wasted space is relatively large. We can effectively save space in the structure in descending order of data size.

    #include <stdio.h>
    int main()
    {
    typedef struct STRU{
    double c;
    float e;
    long g;
    int h;
    short b;
    char a;
    char d;
    char f;
    }S;
    S A = {"a",2,3.1,"d",5.3,"f",77777777,8};
    printf("c in 0x%x\t%d\n",&(A.c),sizeof(A.c));
    printf("e in 0x%x\t%d\n",&(A.e),sizeof(A.e));
    printf("g in 0x%x\t%d\n",&(A.g),sizeof(A.g));
    printf("h in 0x%x\t%d\n",&(A.h),sizeof(A.h));
    printf("b in 0x%x\t%d\n",&(A.b),sizeof(A.b));
    printf("a in 0x%x\t%d\n",&(A.a),sizeof(A.a));
    printf("d in 0x%x\t%d\n",&(A.d),sizeof(A.d));
    printf("f in 0x%x\t%d\n",&(A.f),sizeof(A.f));
    printf("Size=%d",sizeof(T));
    return 0;
    }

    result:

    c in 0x61ff00 8
    e in 0x61ff08 4
    g in 0x61ff0c 4
    h in 0x61ff10 4
    b in 0x61ff14 2
    a in 0x61ff16 1
    d in 0x61ff17 1
    f in 0x61ff18 1
    Size=32