How to log all queries:
set global log_output = 'FILE';
set global general_log_file = '/tmp/mysql.log';
set global general_log = 'ON';
Run the MySQL test suite:
MTR_MAX_TEST_FAIL=1000 ./mysql-test-run.pl --force --suite=engines/funcs --mysqld=--default-storage-engine=upscaledb --mysqld=--plugin_load=ha_upscaledb.so --mysqld=--default-tmp-storage-engine=upscaledb engines/funcs.crash_manyindexes_number
- engines/funcs
- engines/iuds
x Complex keys made up of multiple columns have different sort order than
in innodb (i.e. tpcc/customer table)
PRIMARY KEY (`c_w_id`,`c_d_id`,`c_id`)
-> verify this once more
-> yes, but upscaledb's sort order is clearly defined and seems to be
more correct
x Is the MaxKeyCache properly initialized when the database is initialized?
-> yes
x There's a race condition when using the MaxKeyCache (see the sketch below):
- in compare_and_update(), which requires a mutex
- between querying the cache and actually performing the database operation
- there are more issues; the cache is not updated when a key is deleted
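A minimal sketch of a thread-safe compare_and_update() (the cache layout
and the use of std::string as key type are assumptions, not the real
implementation):

    #include <mutex>
    #include <string>

    struct MaxKeyCache {
      std::mutex mutex;
      std::string max_key;  // empty means "not yet initialized"

      // Atomically compares |key| against the cached maximum and updates
      // the cache if |key| is larger; returns true if append is possible.
      bool compare_and_update(const std::string &key) {
        std::lock_guard<std::mutex> lock(mutex);
        if (!max_key.empty() && key <= max_key)
          return false;
        max_key = key;
        return true;
      }
    };

Note that this only closes the first race; to also close the gap between
the cache check and the database operation, the lock would have to be held
across both, and erase() would still have to invalidate the cache.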
x would it make sense to move this to upscaledb core? - no, not required
x use scoped_ptr for the MaxKeyCache - not reqd
x remove the MaxKeyCache; the upscaledb core already has a similar
optimization in btree_insert.cc: it uses the hinter to decide
if append/prepend is possible!
x make UPS_HINT_APPEND, UPS_HINT_PREPEND internal
x store the "histogram" in the database statistics
x initialize and compare to get the HINT flag
x ups_db_insert
x ups_cursor_insert
x delete histogram if one of the keys is deleted
x txn_insert: if UPS_HINT_APPEND/-PREPEND is set then do not
look up the btree!
x insert optimization: if the key is appended or prepended then
skip the btree lookup (via the UPS_HINT_* flag)
x erase optimization: if the deleted item is in the histogram
then delete it (the histogram must then be re-initialized whenever
it's accessed again)
x txn lookup optimization: if the key is outside of the histogram
then fail immediately (only if the histogram was initialized)
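A sketch of how such a histogram could drive the hint flags and the txn
lookup shortcut (the Histogram struct and the flag values are stand-ins
for the real core internals in btree_insert.cc / the database statistics):

    #include <cstdint>
    #include <string>

    enum HintFlags : uint32_t {
      HINT_NONE    = 0,
      HINT_APPEND  = 1,  // key sorts after the current maximum
      HINT_PREPEND = 2,  // key sorts before the current minimum
    };

    struct Histogram {
      bool initialized = false;
      std::string min_key;
      std::string max_key;

      // For inserts: decides whether the btree lookup can be skipped.
      uint32_t hint_for(const std::string &key) const {
        if (!initialized)
          return HINT_NONE;
        if (key > max_key)
          return HINT_APPEND;
        if (key < min_key)
          return HINT_PREPEND;
        return HINT_NONE;
      }

      // For txn lookups: keys outside the bounds cannot exist.
      bool definitely_missing(const std::string &key) const {
        return initialized && (key < min_key || key > max_key);
      }

      // From erase(): deleting a boundary key forces re-initialization.
      void on_erase(const std::string &key) {
        if (key == min_key || key == max_key)
          initialized = false;
      }
    };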
x It seems that MySQL's handler class already has functions to pack and unpack
keys, i.e. see calculate_key_len(). Can we switch to these functions,
instead of using our own?
-> no, leave as is; switching has no benefits
x implement index_read instead of index_read_map
o Implement BEGIN and COMMIT
o Currently we cannot share a transaction over multiple tables, because each
table is in its own upscaledb environment. We have to store all
tables in the same environment.
x create a new struct "Catalogue" which stores the env and the mapping
of "table/index" to the database name (sketched after this list)
x The environment is stored in a global map, indexed by the database
name (managed by the Catalogue)
x Each table is managed by Catalogue::Table and stores the indices
(i.e. std::vector<ups_db_t *>) - very similar to the current share
x The UpscaledbHandler then uses the Catalogue::Table to store and
access its state (databases, autoidx, ref_length etc)
x Do not close the Environment after it was created; only open it
if necessary - no, this doesn't work b/c the Field pointers
are temporary
x Merge the CreateInfo structure with the new catalogue
x So far this is not a significant change - make sure that all tests
are still running
x Failures b/c database needs recovery
x Store ups_srv_t in Catalogue (next to ups_env_t)
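A rough sketch of the Catalogue layout described above (field names are
assumptions; header names follow the upscaledb 2.x distribution):

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    #include <ups/upscaledb.h>
    #include <ups/upscaledb_srv.h>

    namespace Catalogue {

    // Per-table state; stores the databases backing the table's indices
    // (very similar to the old UpscaledbShare).
    struct Table {
      std::vector<ups_db_t *> indices;  // one ups database per index
      ups_db_t *autoidx = nullptr;      // auto-increment index, if any
      uint32_t ref_length = 0;
    };

    // Per-MySQL-database state; owns the shared environment and maps
    // table names to their state.
    struct Database {
      ups_env_t *env = nullptr;
      ups_srv_t *srv = nullptr;  // for remote access
      std::map<std::string, Table> tables;
    };

    // Global map, indexed by the MySQL database name.
    extern std::map<std::string, Database> databases;

    } // namespace Catalogue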
o Read the configuration from "upscaledb.cnf" - but no longer use
the COMMENT string!
x store the configuration directly in the Catalogue
x when the environment is created: add an empty ".cnf" template
(unless it already exists)
x improve error handling - parser prints to stderr
x how to pass initial values (i.e. page_size)? - they have to
be created PRIOR to the first table! -> document this
-> make sure that this works
x how to pass per-table values (i.e. for record compression)?
x use the COMMENT string, store info in the table
x print this in the template - there should be at least 1 table
with values (even if they're the default)
x when reading the config file: $table.$key = $value
o keys and settings should be case-insensitive (convert everything to
lower case; see the parsing sketch below)
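A sketch of the intended "$table.$key = $value" parsing, including the
lower-casing (hypothetical helper; error reporting omitted):

    #include <algorithm>
    #include <cctype>
    #include <string>

    // Parses a line like "Customers.Record_Compression = zlib"; keys and
    // settings are case-insensitive, so everything is lower-cased first.
    static bool parse_cnf_line(std::string line, std::string *table,
                               std::string *key, std::string *value) {
      std::transform(line.begin(), line.end(), line.begin(),
                     [](unsigned char c) { return std::tolower(c); });
      size_t dot = line.find('.');
      size_t eq = line.find('=');
      if (dot == std::string::npos || eq == std::string::npos || dot > eq)
        return false;
      auto trim = [](std::string s) {
        s.erase(0, s.find_first_not_of(" \t"));
        s.erase(s.find_last_not_of(" \t") + 1);
        return s;
      };
      *table = trim(line.substr(0, dot));
      *key   = trim(line.substr(dot + 1, eq - dot - 1));
      *value = trim(line.substr(eq + 1));
      return !table->empty() && !key->empty() && !value->empty();
    }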
o Remove the remaining traces of the UpscaledbShare and
UpscaledbTableShare classes
o check ha_rocksdb.cc how they do the locking
o ALL tests must be running
o also test compression
o ... and remote access
o Reuse the Environment *per database*, not per table! This is the
critical stage. Need to allocate databases, and a mapping for
"table_name/index" to actual database names
o rewrite implementation of rename_table()
o rewrite implementation of truncate_table()
o make sure that temp. tables are created in an in-memory environment
o currently, ::open and ::create lock the databases_mutex for a
long time; try to increase concurrency
o when are the databases closed, and when is the environment? currently
they are never closed, but it would be nice if they were
shut down properly
o clean up the code
o all tests need to run!
o identify the callback functions in the handler that have to be
implemented
-> handler::external_lock()
-> handler::start_stmt()
-> handler::end_stmt() (a.k.a. reset())
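A condensed sketch of how external_lock() could map statement boundaries
to upscaledb transactions (the TxnState struct and the helper are
assumptions; in the real handler the txn would live in the THD's ha_data
slot, and in_explicit_txn would come from
thd_test_options(thd, OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN)):

    #include <ups/upscaledb.h>

    struct TxnState {
      ups_txn_t *txn = nullptr;
    };

    // lock_acquired is true for F_RDLCK/F_WRLCK, false for F_UNLCK.
    static ups_status_t on_external_lock(TxnState *state, ups_env_t *env,
                                         bool lock_acquired,
                                         bool in_explicit_txn) {
      if (lock_acquired) {
        // statement (or multi-statement txn) starts: begin a txn lazily
        if (!state->txn)
          return ups_txn_begin(&state->txn, env, nullptr, nullptr, 0);
        return UPS_SUCCESS;
      }
      // statement ends: in autocommit mode, commit right away; inside an
      // explicit BEGIN ... COMMIT, keep the txn open until COMMIT arrives
      if (state->txn && !in_explicit_txn) {
        ups_status_t st = ups_txn_commit(state->txn, 0);
        state->txn = nullptr;
        return st;
      }
      return UPS_SUCCESS;
    }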
o upscaledb needs to block in case of a conflict (new behaviour - add
a flag when creating the Transaction!)
o add flag to wrapper libraries
o OR: new API to discover which Transaction is blocking, and a new
API to wait for it (incl. timeout)
-> verify this! InnoDB blocks, but has a timeout. Either we
have to pass the timeout to the Transaction, or the Transaction is
blocked in MySQL (not InnoDB)!
o can we streamline/clean up the transaction handling of the cursor? And
make the cursor part of the currently active transaction? Then we
can avoid a few lookups
o see mysql-tests/suite/engines/README (item #3) on how to test this
o make sure that sysbench oltp now works as expected (with multiple
connections AND transactions)
o Implement index condition pushdown
-> pass a function pointer to the upscaledb core (register it as a
callback), then use a new "UQI scan" method to fast-forward
the cursor
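The registered callback could look roughly like this (entirely
hypothetical; the "UQI scan" method and this signature do not exist in
the core yet):

    #include <ups/upscaledb.h>

    // Result of evaluating the pushed-down condition for one row
    enum IcpResult {
      ICP_MATCH,         // return the row to MySQL
      ICP_NO_MATCH,      // skip the row, continue scanning
      ICP_OUT_OF_RANGE,  // stop the scan entirely
    };

    // The handler registers this with the core; the scan calls it for
    // every key/record pair and only surfaces matching rows, so the
    // cursor can fast-forward without round-trips through the handler.
    typedef IcpResult (*uqi_icp_callback)(const ups_key_t *key,
                                          const ups_record_t *record,
                                          void *handler_context);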
o implement index_read_map_last (or rather: index_read_last), set
the flag HA_READ_ORDER (can be tested w/ wordpress)
o implement UpscaledbHandler::records_in_range()
o implement UpscaledbHandler::records() (does this make things faster??)
o implement UpscaledbHandler::extra()?
. implement UpscaledbHandler::info()
innobase_hton->show_status = innobase_show_status;
-> print info of each environment and of each server, including all
configuration parameters (page size, cache size etc)
. implement handler::check()
o enable integer compression
o TPC-C does not yet work; the loader seems to be fine, but tpcc_run fails
LD_LIBRARY_PATH=/usr/local/mysql/lib time ./tpcc_load -h localhost -d tpcc1000i -u root -p "" -w 1
LD_LIBRARY_PATH=/usr/local/mysql/lib time ./tpcc_start -h127.0.0.1 -P3306 -dtpcc1000u -uroot -w1 -c32 -r10 -l10800
x load upscaledb and innodb with 1 warehouse; compare both databases
-> they're ok (except that compound keys are sorted differently)
o then start running the tests; compare queries (payment and ordstat
seem to fail)
x can we fill each database separately? - no
x enable query logging
x write a perl script which executes the queries simultaneously on
both databases and compares the results
x after fill: SELECT count(c_id) FROM customer WHERE c_w_id = 1 AND c_d_id = 7 AND c_last = 'ABLEESEPRI';
-> result is 79, must be 15!
-> theory: only the first column (c_w_id) is compared, but not the
second one (in UpscaledbHandler::read_next_same)
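If the theory is correct, the fix in read_next_same() is to compare the
full search key, i.e. the bytes of every column of the compound key, not
just the first one. A minimal sketch (plain byte comparison; collations
are a separate problem, see the last item in this file):

    #include <cstring>

    // Returns true if the cursor's current key still matches the search
    // key; |keylen| must span ALL columns of the compound key.
    static bool key_still_matches(const unsigned char *current_key,
                                  const unsigned char *search_key,
                                  unsigned keylen) {
      return ::memcmp(current_key, search_key, keylen) == 0;
    }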
o still get errors during "ramp-up time" (and in sysbench)
-> this is caused by concurrency issues. it is NOT a load issue -
innodb and upscaledb create identical databases, and when running
tpcc_start with 1 connection everything is correct.
-> otoh, loading always happens with 1 connection; we therefore can't
prove that loading is correct
-> sysbench also fails when using it with multiple threads.
x is this only related to the AUTO_INCREMENT field? hack sysbench
and insert a "real" ID instead of a null value. See if the
problem goes away
-> there's an option "oltp_auto_inc" == "off"
-> no, the problem persists
o and when running: 1030, HY000, Got error 1 from storage engine
-> might be related
==> requires implementation of BEGIN/COMMIT
o properly close the environments when the server is shut down
o CREATE TABLE (id integer primary key) -> does not have to store any
records! (but it does)
o create a list of manual tests that we have to perform:
o testing bad configuration file settings
o testing ups_server (i.e. with ups_info)
o ...
-> try to automate as much as possible
o update the documentation
o configuration: how to pass initial values (i.e. page_size)? - they have to
be created PRIOR to the first table! -> document this
o other configuration updates
o we need to release 2.2.1...
o public release ---------------------------------------------------------
o secondary keys should not require a lookup of the primary key;
directly store the blob id of the record
-> look at BDB, they have a similar API
o support encryption
m_create_info->used_fields & HA_CREATE_USED_PASSWORD (handler.h)
https://dev.mysql.com/doc/refman/5.7/en/innodb-tablespace-encryption.html
using a cheap solution or an expensive one?
. improve auto-inc performance; the current code always calls index_last() to
get the last known value. Use recno-databases instead!
but remember that it's possible to use UPDATE to change the key, as long as
the key was not yet used.
-> or can we cache index_last() instead?
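A sketch of the caching idea (hypothetical struct; as noted above, an
UPDATE that changes a not-yet-used key must invalidate the cache):

    #include <cstdint>

    struct AutoIncState {
      bool cached = false;
      uint64_t last_value = 0;

      // Returns the next auto-increment value; |index_last| is any
      // callable returning the current maximum, and is only consulted
      // on the first call or after an invalidation.
      template <typename IndexLastFn>
      uint64_t next(IndexLastFn index_last) {
        if (!cached) {
          last_value = index_last();
          cached = true;
        }
        return ++last_value;
      }

      void invalidate() { cached = false; }
    };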
. allow in-memory environments (comment option: "in_memory=true|false")
currently would not work because ::create() closes the Environment,
::open() opens it. In-memory environments have to remain open!
-> when creating, store the UpscaledbData globally; do not close the
Environment again (and therefore do not open it again in ::open())
. how can we improve performance of duplicate keys?
cheap solution: split duplicate table in blocks, only merge blocks when
they're empty
better solution (more effort): blobs form a linked list
pro: index will become smaller and therefore faster
pro: less code
con: how to consolidate them then?
-> see upscaledb's TODO file
. online DDL:
http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html
. need a code generator for the remote interface that shows how to
de-serialize the database records
. identify the 3 most commonly used MySQL operations in Wordpress; use
the API instead and see if it improves performance
-> can we move this to a wordpress plugin??
. engines/iuds.strings_charsets_update_delete
-- select * from t8 where a like 'uu%';
-- index_read_map locates the second key via approx. matching. The
first key is not a match because it starts with a 'U' instead
of a 'u'.
-- however, the mysql collation is case-insensitive and therefore
includes 'U'!
-- have to compare using the collation of the current field, if there
is one; if necessary, move the cursor "to the left"
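A sketch of the collation-aware comparison (assumes the MySQL 5.7
strnncollsp() signature; in the handler, |cs| would come from the
field's charset()):

    #include <my_global.h>
    #include <m_ctype.h>

    // Compares two keys with the field's collation instead of memcmp(),
    // so that 'U' and 'u' compare equal under a case-insensitive
    // collation like latin1_swedish_ci.
    static int compare_with_collation(const CHARSET_INFO *cs,
                                      const uchar *a, size_t a_len,
                                      const uchar *b, size_t b_len) {
      return cs->coll->strnncollsp(cs, a, a_len, b, b_len);
    }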