Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault when yk_mt_control_point commented out #31

Closed
Pavel-Durov opened this issue Jul 24, 2023 · 5 comments
Closed

Segmentation fault when yk_mt_control_point commented out #31

Pavel-Durov opened this issue Jul 24, 2023 · 5 comments
Assignees

Comments

@Pavel-Durov
Copy link
Contributor

When I comment out the yk_mt_control_point:

diff --git a/src/lvm.c b/src/lvm.c
index 03d947f..e84b0cc 100644
--- a/src/lvm.c
+++ b/src/lvm.c
@@ -1195,7 +1195,7 @@ uint32_t luaV_execute (lua_State *L, CallInfo *ci) {
     StkId ra;  /* instruction's A register */
     vmfetch();
 #ifdef USE_YK
-    yk_mt_control_point(G(L)->yk_mt, ykloc);
+    // yk_mt_control_point(G(L)->yk_mt, ykloc);
 #endif
     #if 0
       /* low-level line tracing for debugging Lua */

I get a segment fault at yk_location_drop:

Program received signal SIGSEGV, Segmentation fault.
0x00000000007fd350 in luaF_freeproto (L=<optimised out>, f=<optimised out>) at lfunc.c:276
276             yk_location_drop(f->yklocs[i]);
(gdb) bt
#0  0x00000000007fd350 in luaF_freeproto (L=<optimised out>, f=<optimised out>) at lfunc.c:276
#1  0x00000000007ffc9e in freeobj (L=0x90be38, o=0x91dbd0) at lgc.c:767
#2  0x000000000080ce05 in sweepgen (L=0x90be38, g=<optimised out>, p=<optimised out>, limit=<optimised out>, pfirstold1=<optimised out>) at lgc.c:1106
#3  0x000000000080c3d3 in youngcollection (L=0x90be38, g=0x90bf00) at lgc.c:1239
#4  0x000000000080b23d in genstep (L=0x90be38, g=0x90bf00) at lgc.c:1434
#5  0x000000000080ad04 in luaC_step (L=0x90be38) at lgc.c:1686
#6  0x00000000007d9f37 in lua_pushstring (L=0x90be38, s=<optimised out>) at lapi.c:553
#7  0x000000000089359c in findloader (L=0x90be38, name=0x91dd08 "tracegc") at loadlib.c:641
#8  0x0000000000892d9a in ll_require (L=0x90be38) at loadlib.c:666
#9  0x00000000007f3565 in precallC (L=0x90be38, func=<optimised out>, nresults=<optimised out>, f=0x892b10 <ll_require>) at ldo.c:506
#10 0x00000000007f3b06 in luaD_precall (L=0x90be38, func=0x918690, nresults=1) at ldo.c:569
#11 0x0000000000879701 in luaV_execute (L=0x90be38, ci=<optimised out>) at lvm.c:1667
#12 0x00000000007f462b in ccall (L=0x90be38, func=<optimised out>, nResults=<optimised out>, inc=<optimised out>) at ldo.c:609
#13 0x00000000007f4751 in luaD_callnoyield (L=0x90be38, func=0x90c610, nResults=-1) at ldo.c:627
#14 0x00000000007e09f3 in f_call (L=0x90be38, ud=<optimised out>) at lapi.c:1041
#15 0x00000000007ee6d7 in luaD_rawrunprotected (L=0x90be38, f=0x7e0930 <f_call>, ud=0x7ffff2499308) at ldo.c:144
#16 0x00000000007f6ef6 in luaD_pcall (L=0x90be38, func=0x7e0930 <f_call>, u=0x7ffff2499308, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#17 0x00000000007e04dc in lua_pcallk (L=0x90be38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, k=<optimised out>) at lapi.c:1067
#18 0x00000000007d2113 in docall (L=0x90be38, narg=0, nres=-1) at lua.c:160
#19 0x00000000007d1814 in handle_script (L=0x90be38, argv=<optimised out>) at lua.c:255
#20 0x00000000007cfad3 in pmain (L=0x90be38) at lua.c:634
#21 0x00000000007f3565 in precallC (L=0x90be38, func=<optimised out>, nresults=<optimised out>, f=0x7cf2e0 <pmain>) at ldo.c:506
#22 0x00000000007f3bb8 in luaD_precall (L=0x90be38, func=0x90c5d0, nresults=1) at ldo.c:572
#23 0x00000000007f456f in ccall (L=0x90be38, func=0x90c5d0, nResults=1, inc=<optimised out>) at ldo.c:607
#24 0x00000000007f4751 in luaD_callnoyield (L=0x90be38, func=0x90c5d0, nResults=1) at ldo.c:627
--Type <RET> for more, q to quit, c to continue without paging--
#25 0x00000000007e09f3 in f_call (L=0x90be38, ud=<optimised out>) at lapi.c:1041
#26 0x00000000007ee6d7 in luaD_rawrunprotected (L=0x90be38, f=0x7e0930 <f_call>, ud=0x7ffff2499058) at ldo.c:144
#27 0x00000000007f6ef6 in luaD_pcall (L=0x90be38, func=0x7e0930 <f_call>, u=0x7ffff2499058, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#28 0x00000000007e04dc in lua_pcallk (L=0x90be38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, 
    k=<optimised out>) at lapi.c:1067
#29 0x00000000007cefa0 in main (argc=<optimised out>, argv=<optimised out>) at lua.c:66

Yk commit: 5bcd2c575e0c59ea964e1bf6b11eb4c85142b140

@Pavel-Durov
Copy link
Contributor Author

Looking at this issue
Segmentation fault is coming from f->yklocs[i] cause f->yklocs is NULL

I can see that two loops has been detected and locations are set here for instructions: yklocs[29] and yklocs[845]
But when yk_location_drop(f->yklocs[29]) is called here f->yklocs is NULL

We suspect that this behaviour is caused by GC freeing the f->yklocs references.

@ltratt
Copy link
Contributor

ltratt commented Jul 25, 2023

And just to be clear we believe this has nothing to do with yk, per se: it's just that we haven't found quite the right places in (yk)lua to alter?

@vext01
Copy link
Contributor

vext01 commented Jul 25, 2023

This is subtly different, but related to this PR. The crash we saw this morning Pavel, was with the control point uncommented. But I bet the issue of freeing with the control point commented is related to GC too...

@ltratt
Copy link
Contributor

ltratt commented Jul 25, 2023

If the code crashes with the control point commented out, we'd also expect/hope it to crash in the same way with the control point not commented out. [Or, in other words, I believe this issue is orthogonal to the control point.]

@Pavel-Durov
Copy link
Contributor Author

Yes, either way ( yk_mt_control_point commented out or not) - we crash in the same way on f->yklocs[i] cause f->yklocs is NULL when we call yk_location_drop

@ltratt ltratt changed the title Segementation fault when yk_mt_control_point commented out Segmentation fault when yk_mt_control_point commented out Aug 15, 2023
bors bot added a commit that referenced this issue Aug 25, 2023
64: Track initialised yk locations r=ltratt a=Pavel-Durov

Partially fixes (only for serial compilation)  - #62

# Changes

- Remove reallocarray in favour of calloc
- Change YkLocations to be stored as pointers
- Change lyk interface to reflect its "hooks" functionality
- Move tests to a separate test script and enable serialised all.lua test suite
- Added logging in lyk module for ease of debugging
- Updated readme

`test.sh` was tested for resiliency multiple times:
```shell
$ try_repeat -v 100 sh ./test.sh
``` 
# Issues

## Uninitialised locations memory 
`reallocarray` does not initialise memory with default zero bytes locations as `calloc`. 
Moving to `calloc` and using pointers allowed to check for NULL with default values set at initialisation time.

## Tests

When lua tests are executed one by one as single files it produces different results from when it's executed via `all.lua` test suite.

Example:

```shell

$ YKD_SERIALISE_COMPILATION=1 ../src/lua -e"_U=true" ./main.lua 
...
../src/lua: ./main.lua:343: assertion failed!
stack traceback:
        [C]: in function 'assert'
        ./main.lua:343: in main chunk
        [C]: in ?
```
When run through `all.lua` it passes:
```shell
$ YKD_SERIALISE_COMPILATION=1 ../src/lua -e"_U=true" ./all.lua 
...
***** FILE 'main.lua'*****
...
***** FILE 'gc.lua'*****
...
```
Since `all.lua` is what we base the stability of yklua, I think we should include it in the tests as is, currently only serialised compilation works.

Running `all.lua` prior to these changes results in error (`main/56c5787799b876f36babbae24e9afc025806b831`):
```
$ YKD_SERIALISE_COMPILATION=1 gdb -ex 'r' -ex 'bt' --args ../src/lua -e"_U=true" ./all.lua 
...
***** FILE 'main.lua'*****

Program received signal SIGSEGV, Segmentation fault.
core::sync::atomic::AtomicUsize::fetch_sub (self=<optimised out>) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs:2575
2575    /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs: No such file or directory.
#0  core::sync::atomic::AtomicUsize::fetch_sub (self=<optimised out>) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/sync/atomic.rs:2575
#1  alloc::sync::{impl#33}::drop<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global> (self=0x7fffffffb410)
    at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/alloc/src/sync.rs:2370
#2  0x00007ffff7b0d4ab in core::ptr::drop_in_place<alloc::sync::Arc<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global>> () at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/ptr/mod.rs:497
#3  0x00007ffff7b0067e in core::mem::drop<alloc::sync::Arc<lock_api::mutex::Mutex<parking_lot::raw_mutex::RawMutex, ykrt::location::HotLocation>, alloc::alloc::Global>> (_x=...) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/mem/mod.rs:987
#4  0x00007ffff7b121f0 in ykrt::location::{impl#1}::drop (self=0x7fffffffb468) at ykrt/src/location.rs:197
#5  0x00007ffff7affa9b in core::ptr::drop_in_place<ykrt::location::Location> () at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/ptr/mod.rs:497
#6  0x00007ffff7aff6bd in core::mem::drop<ykrt::location::Location> (_x=...) at /rustc/ef85656a10657ba5e4f7fe2931a4ca6293138d51/library/core/src/mem/mod.rs:987
#7  0x00007ffff7b004bd in ykcapi::yk_location_drop (loc=...) at ykcapi/src/lib.rs:90
#8  0x000000000088aa0e in free_loc (f=<optimised out>, i=<optimised out>, idx=<optimised out>) at lyk.c:66
#9  0x000000000088aba5 in yk_free_locactions (f=0x928cc0) at lyk.c:76
--Type <RET> for more, q to quit, c to continue without paging--
#10 0x0000000000807738 in luaF_freeproto (L=0x916e38, f=0x928cc0) at lfunc.c:276
#11 0x0000000000809fde in freeobj (L=0x916e38, o=0x928cc0) at lgc.c:767
#12 0x0000000000817145 in sweepgen (L=0x916e38, g=<optimised out>, p=<optimised out>, limit=<optimised out>, pfirstold1=<optimised out>) at lgc.c:1106
#13 0x0000000000816713 in youngcollection (L=0x916e38, g=0x916f00) at lgc.c:1239
#14 0x000000000081557d in genstep (L=0x916e38, g=0x916f00) at lgc.c:1434
#15 0x0000000000815044 in luaC_step (L=0x916e38) at lgc.c:1686
#16 0x00000000007e4447 in lua_pushstring (L=0x916e38, s=<optimised out>) at lapi.c:553
#17 0x000000000089e60c in findloader (L=0x916e38, name=0x928df8 "tracegc") at loadlib.c:641
#18 0x000000000089de0a in ll_require (L=0x916e38) at loadlib.c:666
#19 0x00000000007fda75 in precallC (L=0x916e38, func=<optimised out>, nresults=<optimised out>, f=0x89db80 <ll_require>) at ldo.c:506
#20 0x00000000007fe016 in luaD_precall (L=0x916e38, func=0x923780, nresults=1) at ldo.c:569
#21 0x0000000000883f68 in luaV_execute (L=0x916e38, ci=<optimised out>) at lvm.c:1655
#22 0x00000000007feb3b in ccall (L=0x916e38, func=<optimised out>, nResults=<optimised out>, inc=<optimised out>) at ldo.c:609
#23 0x00000000007fec61 in luaD_callnoyield (L=0x916e38, func=0x917730, nResults=-1) at ldo.c:627
#24 0x00000000007eaf03 in f_call (L=0x916e38, ud=<optimised out>) at lapi.c:1041
#25 0x00000000007f8be7 in luaD_rawrunprotected (L=0x916e38, f=0x7eae40 <f_call>, ud=0x7ffff2487308) at ldo.c:144
#26 0x0000000000801406 in luaD_pcall (L=0x916e38, func=0x7eae40 <f_call>, u=0x7ffff2487308, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#27 0x00000000007ea9ec in lua_pcallk (L=0x916e38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, k=<optimised out>)
    at lapi.c:1067
#28 0x00000000007dc623 in docall (L=0x916e38, narg=0, nres=-1) at lua.c:160
#29 0x00000000007dbd24 in handle_script (L=0x916e38, argv=<optimised out>) at lua.c:255
#30 0x00000000007d9fe3 in pmain (L=0x916e38) at lua.c:634
#31 0x00000000007fda75 in precallC (L=0x916e38, func=<optimised out>, nresults=<optimised out>, f=0x7d97f0 <pmain>) at ldo.c:506
#32 0x00000000007fe0c8 in luaD_precall (L=0x916e38, func=0x9176f0, nresults=1) at ldo.c:572
#33 0x00000000007fea7f in ccall (L=0x916e38, func=0x9176f0, nResults=1, inc=<optimised out>) at ldo.c:607
#34 0x00000000007fec61 in luaD_callnoyield (L=0x916e38, func=0x9176f0, nResults=1) at ldo.c:627
#35 0x00000000007eaf03 in f_call (L=0x916e38, ud=<optimised out>) at lapi.c:1041
#36 0x00000000007f8be7 in luaD_rawrunprotected (L=0x916e38, f=0x7eae40 <f_call>, ud=0x7ffff2487058) at ldo.c:144
#37 0x0000000000801406 in luaD_pcall (L=0x916e38, func=0x7eae40 <f_call>, u=0x7ffff2487058, old_top=<optimised out>, ef=<optimised out>) at ldo.c:926
#38 0x00000000007ea9ec in lua_pcallk (L=0x916e38, nargs=<optimised out>, nresults=<optimised out>, errfunc=<optimised out>, ctx=<optimised out>, k=<optimised out>)
    at lapi.c:1067
#39 0x00000000007d94b0 in main (argc=<optimised out>, argv=<optimised out>) at lua.c:660
```

That's cause we're freeing uninitialised yk locations.


Co-authored-by: Pavel Durov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants