View Issue Details

IDProjectCategoryView StatusLast Update
0007820VCOMPONENT-VKERNELAPIpublic2016-12-28 16:50
ReporterChris ZakrewskyAssigned ToIvan Smahin 
PriorityimmediateSeveritycrashReproducibilityrandom
Status resolvedResolutionfixed 
PlatformINTELOSWindowsOS Version*
Product Version 
Target VersionFixed in Version6.6.x 
Summary0007820: Compound index issues
DescriptionI get a lot of database corruptions on our Windows 2008 R2 Servers.
They are running scheduled Reindex, Defragment and Compress.
Especially Uniques are prone to corruption.


VServer.log can show:

Sun Nov 27 2016 23:30:13.603000 (TID 8612): (UID 600) ERROR 0x86000: Compressor error.
Sun Nov 27 2016 23:30:22.619000 (TID 10708): Scheduler sleeps waiting for 'Nomadic_DefragmentDb' until '2016-11-28 01:00:00:000'

Warnings.log can show:

20161128_000748 : Eat throw from DeleteRecordFromIndex() : .ObjectGroup_member_unique. RecID: 516. Msg: Index error. Please reindex.
20161128_000748 : Eat throw from DeleteRecordFromIndex() : .EntityMeta_unique_id_name. RecID: 1762. Msg: Index error. Please reindex.
20161128_000838 : Eat throw from DeleteRecordFromIndex() : .PointsSmall_geom_id_point_idx_unique. RecID: 42417. Msg: Index error. Please reindex.
20161128_000838 : Eat throw from DeleteRecordFromIndex() : .PointsSmall_geom_id_point_idx_unique. RecID: 42418. Msg: Index error. Please reindex.
20161128_000839 : Eat throw from DeleteRecordFromIndex() : .ObjectGroup_member_unique. RecID: 409. Msg: Index error. Please reindex.
20161128_000839 : Catched ERR_INDEX_CORRUPTED. : .EntityMeta_unique_id_name. RecID: 1527. Msg: Index "EntityMeta_unique_id_name" is corrupted. Please reindex.
20161128_000839 : Catched ERR_INDEX_CORRUPTED. : .EntityMeta_unique_id_name. RecID: 1528. Msg: Index "EntityMeta_unique_id_name" is corrupted. Please reindex.
20161128_000839 : Catched ERR_INDEX_CORRUPTED. : .EntityMeta_unique_id_name. RecID: 1529. Msg: Index "EntityMeta_unique_id_name" is corrupted. Please reindex.

The same happen even a few hours after reindexing:

20161123_105959 : Reindex of : entity_id
20161123_105959 : Reindex of : constr_id
20161123_105959 : Reindex of : flag_trim_profile
20161123_105959 : Reindex of : entity_id
20161123_105959 : Reindex of : Kind
20161123_105959 : Reindex of : ColumnName
20161123_105959 : Reindex of : ParentPtr
20161123_105959 : Reindex of : ItemDescrPtr
20161123_105959 : Reindex of : ParentPtr
20161123_105959 : Reindex of : fld_name
20161123_105959 : Reindex of : fld_isProject
20161123_105959 : Reindex of : fld_name
20161123_105959 : Reindex of : fld_is_admin
20161123_105959 : Reindex of : fld_sysuser_ptr
20161123_105959 : Reindex of : fld_privilege_option
20161123_105959 : Reindex of : fld_parentobject_ptr
20161123_105959 : Reindex of : fld_sysdatabase_ptr
20161123_105959 : Reindex of : fld_name
20161123_105959 : Reindex of : fld_enabled
20161123_105959 : Reindex of : fld_preserve_on_completion
20161123_125533 : Catched ERR_INDEX_CORRUPTED. : .ObjectGroup_member_unique. RecID: 352. Msg: Index "ObjectGroup_member_unique" is corrupted. Please reindex.
20161123_125533 : Catched ERR_INDEX_CORRUPTED. : .ObjectGroup_member_unique. RecID: 353. Msg: Index "ObjectGroup_member_unique" is corrupted. Please reindex.
20161123_125533 : Catched ERR_INDEX_CORRUPTED. : .EntityId_uuid_version_unique. RecID: 92. Msg: Index "EntityId_uuid_version_unique" is corrupted. Please reindex.
20161123_125710 : Catched ERR_INDEX_CORRUPTED. : .ObjectGroup_member_unique. RecID: 302. Msg: Index "ObjectGroup_member_unique" is corrupted. Please reindex.

Diagnostics show corruption of tables:

***************************************************************************
Table: ObjectData
***************************************************************************
==========> Field: entity_id , type Unsigned Long
==========> IndexFile
PAGE (0, 0) Inventory pages OK.
Index Inventory pages OK.
PAGE (3, 2) OK.
OK.
All right! ( IndexFile )
All right! ( Field: entity_id , type Unsigned Long )
==========> Field: data_blob , type BLOB
.....> SegmentFile:
"FILE: ../../../sources/VKernel/FBL/prot/LowLevel/Volume/VolumeOnFile/prot_SegmentMap/Tree/FBL_SegmentMap_RBTree.cpp FUNCTION: Search LINE: 355" SegmentMap is corrupted - "UNKNOWN"
Problems found! ( SegmentFile: )
Problems found! ( Field: data_blob , type BLOB )
Problems found! ( Table: ObjectData )

Or:

***************************************************************************
Table: ObjectData
***************************************************************************
==========> Field: entity_id , type Unsigned Long
==========> IndexFile
PAGE (0, 0) Inventory pages OK.
Index Inventory pages OK.
PAGE (3, 2) OK.
OK.
All right! ( IndexFile )
All right! ( Field: entity_id , type Unsigned Long )
==========> Field: data_blob , type BLOB
.....> SegmentFile:
                    Segment 191 has wrong mIndex must be 3 in fact 0
                    Segment 191 has wrong mFirstSegment must be 189 in fact 0
                    Segment 191 has wrong mPrevSegment must be 190 in fact 0
Problems found! ( SegmentFile: )
Problems found! ( Field: data_blob , type BLOB )
Problems found! ( Table: ObjectData )


The database can no longer be rescued because SQL dump crashes (see the screenshot)

How can I go about to catch exactly what and when causes all this mischief?
Is there any particular API which you are using in your Diagnostics which could be called directly from VCDK after each query to diagnose for instance the status of reindexing?



TagsNo tags attached.

Activities

Chris Zakrewsky

Chris Zakrewsky

2016-11-28 12:40

reporter  

Snapshot-2016-11-27-230730.png (20,042 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 12:44

reporter   ~0009360

Or:

***************************************************************************
Database:
***************************************************************************

.....> DBStorage:
.....> VolumeOnFile: /home/chris/Public/RVS/Paketering/2016.11.27/DB/Nomadic3.vdb
File {1125912791744517}: the tree corrupted: balance violated.

Problems found! ( VolumeOnFile: /home/chris/Public/RVS/Paketering/2016.11.27/DB/Nomadic3.vdb )


Problems found! ( DBStorage: )
Ruslan Zasukhin

Ruslan Zasukhin

2016-11-28 12:54

administrator   ~0009361

Hi Chris,

problem not happens if not do Defragment and Compress?

Please try avoid this operations for now.
I remember that some time ago, somebody try them and there was similar picture,
but Ivan never was able to reproduce problems.

If you have some db in GOOD STATE,
which is possible corrupt, we will be thankful
Ruslan Zasukhin

Ruslan Zasukhin

2016-11-28 12:55

administrator   ~0009362

Compressor error: you use compression on some fields?
Ruslan Zasukhin

Ruslan Zasukhin

2016-11-28 13:00

administrator   ~0009363

Well, something weired, and I think we hear this is the first time from you?

I think we will be able spend nearest days if needed to catch and fix this trouble.
So lets think how we can do that together.

We need understand as much as possible about your db and processes.
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 13:03

reporter   ~0009364

I am sending you by e-mail a small database which became corrupted yesterday.
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 13:14

reporter   ~0009365

> Compressor error: you use compression on some fields? Yes, on text blobs (I store XML files there).

Yesterday I saw corruption during a course of normal operation however.
The day started with Diagnostics where I saw problems in DB (the one attached), brought it off-line and tried several things:

- dump SQL: failed.
- dump XML: succeeded
- clone DB: succeded

Cloned database no longer triggered errors in Diagnostics but the data in it was decimated, a few blobs were empty (yes, these compressed Blobs!)
 
I put it on-line, repaired missing data by removing corrupted objects and transferring their copies back from a sister database on another server.
After that I ran diagnostics again -- and it found new Reindex errors!
Reindexed again and called it a day.
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 13:22

reporter   ~0009366

I forgot to add that these Windows 2008 servers are quite under-powered performance-wise and that it may be a contributing factor - they are very, very slow, have a lot of RAM and large (very slow) disks, well, you get the picture.
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 15:50

reporter  

DB-Empty-but-not-really-good.zip (28,215 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-11-28 15:52

reporter   ~0009367

For completness of this issue attached a minimal dataset.

1. Open the Nomadic_good_empty.vdb in Studio 6.6.2 as local file.
2. Run Reindex.
3. Run Advanced Diagnostics ---> No errors.
4. Close database.
5. Close Studio.
6. Open the Nomadic_good_empty.vdb in Studio 6.6.2 as local file again.
7. Do NOT RUN Reindex
8. Run Advanced Diagnostics ---> Errors found!

Conclusion: in-memory Reindexing does not get written to index files!
Chris Zakrewsky

Chris Zakrewsky

2016-11-29 16:46

reporter   ~0009370

Last edited: 2016-11-29 16:46

View 2 revisions

The same scenario can be reproduced entirely outside Studio:

1. Open the Nomadic_good_empty.vdb via SDK (Database_Open())
2. Run Reindex (via SQL).
3. Run Database_Diagnose() ---> No errors.
4. Close database.
5. Shut down Client.
6. Stop - and - restart VServer.
7. Start Client again.
8. Open the Nomadic_good_empty.vdb via SDK (Database_Open()) again.
9. Do NOT RUN Reindex
10. Run Database_Diagnose() ---> Errors found!

As long as VServer is not restarted, reindexing sticks, no matter how many times Client is restarted. However after restarting VServer it is gone (according to Database_Diagnose()). Steps 5 and 7 are optional, it is restarting VServer what matters here.

Ruslan Zasukhin

Ruslan Zasukhin

2016-11-29 21:30

administrator   ~0009371

I have asked IVan to inform you, but it seems he forget
Ivan have reproduce some issue with diagnoze of compounnd index.
He make fixes.

After that we will make a new hotfix build,
and we will see if anything else exists.
Chris Zakrewsky

Chris Zakrewsky

2016-12-04 19:23

reporter   ~0009376

Last edited: 2016-12-04 19:28

View 2 revisions

Unfortunately index corruption still exist.
I am sending you (via e-mail) a small database where you can very easily reproduce it on Valentina 6.6.4.

As per attached screenshots, the steps are:

1. Open this database in VStudio
2. Run Diagnostics on database --> No errors
3. Open table EntityId
4. Delete ANY record there
5. Run Diagnostics on database --> Errors found!

"Compound index: 280. Index record count (52) differ from table one (51)"

Once these errors start on delete record, they escalate to pop up also on insert record (and from there it goes straight downhill).

Also no DEFRAGMENT nor COMPRESS are involved.

Chris Zakrewsky

Chris Zakrewsky

2016-12-04 19:24

reporter  

Snapshot-2016-12-04-200602.png (7,629 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-04 19:24

reporter  

Snapshot-2016-12-04-200728.png (43,681 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-04 19:41

reporter  

Snapshot-2016-12-04-200940.png (18,273 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-04 21:36

reporter   ~0009378

Maybe unrelated, but whenever indexes are reported damaged, I usually get crash when unloading Valentina:

Exception 0xc0000005 at Sun Dec 04 19:37:39 2016

Stack dump:
     EBP RetAddr arg1 arg2 arg3 arg4
 0: 0029e850 6a426336 d269b906 0029e960 0029e8cc 5077bdc8 <nosymbol>
 1: 0029e888 65786979 00000000 d269b916 0029e8c4 5296612a VShared_release_x86!fbl::DatabaseManager::Destroy_Databases+0x86 [omap]
 2: 0029e898 5296612a 5521476d 0029e960 0029e8cc 5077bdc8 VKernel_release_x86!fbl::ValentinaShutDown+0x59 [omap]
 3: 0029e8c4 59a698c2 0029e9fc 18fb8be8 cccccccc cccccccc vcsdk_release_x86!Valentina_Shutdown+0x5a [omap]
Chris Zakrewsky

Chris Zakrewsky

2016-12-11 22:21

reporter   ~0009394

Just a quick feedback:

Now it is a bit better in 6.6.5 than it was in 6.6.4, but I still get index corruptions :(

I'll try to make a reproducible dataset as soon as I can, this issue proves to be a real die hard...
Chris Zakrewsky

Chris Zakrewsky

2016-12-13 17:04

reporter  

Snapshot-2016-12-13-175139.png (28,469 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-13 17:04

reporter  

Snapshot-2016-12-13-175323.png (50,453 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-13 17:05

reporter  

Snapshot-2016-12-13-180303.png (29,904 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-13 17:08

reporter  

Valentina_6.6.5_Indexing_BUG.zip (170,981 bytes)
Chris Zakrewsky

Chris Zakrewsky

2016-12-13 17:18

reporter   ~0009398

Here it goes: 100% reproducible, both Windows and Linux.
I am referring to screenshots and the dataset uploaded today, 2016-12-13:

1. Open database in Studio.
2. Run Diagnostics on database --> No errors
3. Open table ObjectGroup
4. Delete the only record there (first screenshot)
5. Open SQL console and run the SQL sequence as in bug_test.sql (second screenshot)

Result: No problem.

6. Run the SQL sequence as in bug_test.sql ONE MORE TIME (second screenshot)

Result: This time kernel detects a constraint violation (AND CORRUPTS THE IN-MEMORY INDEX)

7. Run Diagnostics on database --> Problems found (third screenshot)
Ruslan Zasukhin

Ruslan Zasukhin

2016-12-28 16:50

administrator   ~0009428

6.6.9 --

issue was found when Diagnoze do UNIQUE index, in some cases.
FIXED.

Disk files are ok,
only RAM structures where in trouble.

Issue History

Date Modified Username Field Change
2016-11-28 12:40 Chris Zakrewsky New Issue
2016-11-28 12:40 Chris Zakrewsky File Added: Snapshot-2016-11-27-230730.png
2016-11-28 12:44 Chris Zakrewsky Note Added: 0009360
2016-11-28 12:54 Ruslan Zasukhin Note Added: 0009361
2016-11-28 12:55 Ruslan Zasukhin Note Added: 0009362
2016-11-28 13:00 Ruslan Zasukhin Note Added: 0009363
2016-11-28 13:03 Chris Zakrewsky Note Added: 0009364
2016-11-28 13:14 Chris Zakrewsky Note Added: 0009365
2016-11-28 13:22 Chris Zakrewsky Note Added: 0009366
2016-11-28 15:50 Chris Zakrewsky File Added: DB-Empty-but-not-really-good.zip
2016-11-28 15:52 Chris Zakrewsky Note Added: 0009367
2016-11-29 16:46 Chris Zakrewsky Note Added: 0009370
2016-11-29 16:46 Chris Zakrewsky Note Edited: 0009370 View Revisions
2016-11-29 21:30 Ruslan Zasukhin Note Added: 0009371
2016-11-30 08:05 Ivan Smahin Status new => resolved
2016-11-30 08:05 Ivan Smahin Fixed in Version => 6.6.x
2016-11-30 08:05 Ivan Smahin Resolution open => fixed
2016-11-30 08:05 Ivan Smahin Assigned To => Ivan Smahin
2016-12-04 19:23 Chris Zakrewsky Note Added: 0009376
2016-12-04 19:23 Chris Zakrewsky Status resolved => feedback
2016-12-04 19:23 Chris Zakrewsky Resolution fixed => reopened
2016-12-04 19:24 Chris Zakrewsky File Added: Snapshot-2016-12-04-200602.png
2016-12-04 19:24 Chris Zakrewsky File Added: Snapshot-2016-12-04-200728.png
2016-12-04 19:24 Chris Zakrewsky File Added: Snapshot-2016-12-04-200940.png
2016-12-04 19:27 Chris Zakrewsky Status feedback => assigned
2016-12-04 19:28 Chris Zakrewsky Note Edited: 0009376 View Revisions
2016-12-04 19:41 Chris Zakrewsky File Deleted: Snapshot-2016-12-04-200940.png
2016-12-04 19:41 Chris Zakrewsky File Added: Snapshot-2016-12-04-200940.png
2016-12-04 21:36 Chris Zakrewsky Note Added: 0009378
2016-12-07 08:19 Ivan Smahin Status assigned => resolved
2016-12-07 08:19 Ivan Smahin Resolution reopened => fixed
2016-12-07 10:36 Ivan Smahin Status resolved => feedback
2016-12-07 10:36 Ivan Smahin Resolution fixed => reopened
2016-12-07 10:37 Ivan Smahin Category Defragment => API
2016-12-07 10:37 Ivan Smahin Summary VServer 6.6.2: Reindexing Uniques crashes database => Compound index issues
2016-12-07 10:37 Ivan Smahin Status feedback => resolved
2016-12-07 10:37 Ivan Smahin Resolution reopened => fixed
2016-12-11 22:21 Chris Zakrewsky Note Added: 0009394
2016-12-11 22:21 Chris Zakrewsky Status resolved => feedback
2016-12-11 22:21 Chris Zakrewsky Resolution fixed => reopened
2016-12-13 17:04 Chris Zakrewsky File Added: Snapshot-2016-12-13-175139.png
2016-12-13 17:04 Chris Zakrewsky File Added: Snapshot-2016-12-13-175323.png
2016-12-13 17:05 Chris Zakrewsky File Added: Snapshot-2016-12-13-180303.png
2016-12-13 17:08 Chris Zakrewsky File Added: Valentina_6.6.5_Indexing_BUG.zip
2016-12-13 17:18 Chris Zakrewsky Note Added: 0009398
2016-12-13 17:18 Chris Zakrewsky Status feedback => assigned
2016-12-28 16:50 Ruslan Zasukhin Note Added: 0009428
2016-12-28 16:50 Ruslan Zasukhin Status assigned => resolved
2016-12-28 16:50 Ruslan Zasukhin Resolution reopened => fixed