
I am trying to call Ruby's regex engine from C code:

#include <ruby.h>
#include "ruby/re.h"

int main(int argc, char** argv) {
    
    char string[] = "regex";
    ruby_setup();
    rb_reg_regcomp(string);
    return 0;
}

I built the newest version of Ruby myself (commit 0b303c683007598a31f2cda3d512d981b278f8bd) and linked my program against it. It compiles, but with this warning:

fuzzer.c: In function ‘main’:
fuzzer.c:10:17: warning: passing argument 1 of ‘rb_reg_regcomp’ makes integer from pointer without a cast [-Wint-conversion]
   10 |  rb_reg_regcomp(string);
      |                 ^~~~~~
      |                 |
      |                 char *
In file included from fuzzer.c:4:
/home/cyberhacker/Asioita/Hakkerointi/Rubyregex/ruby/build/output/include/ruby-3.3.0+0/ruby/re.h:36:28: note: expected ‘VALUE’ {aka ‘long unsigned int’} but argument is of type ‘char *’
   36 | VALUE rb_reg_regcomp(VALUE str);

I think that is because the VALUE type in the Ruby source code is a generic pointer to any type. When I try to run the program I get a segfault with this backtrace:

Program received signal SIGSEGV, Segmentation fault.
rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
181     return ENC_DUMMY_P(enc) != 0;
(gdb) where
#0  rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
#1  0x000055555569bd00 in rb_reg_initialize (obj=obj@entry=140737345038080, s=0xc62000007ffff78a <error: Cannot access memory at address 0xc62000007ffff78a>, len=-4574812796478291968, enc=enc@entry=0x0, options=options@entry=0, err=err@entry=0x7fffffffdb30 "", sourcefile=0x0, sourceline=0) at ../re.c:3198
#2  0x00005555556a11c8 in rb_reg_initialize_str (sourceline=0, sourcefile=0x0, err=0x7fffffffdb30 "", options=0, str=140737488346082, obj=140737345038080) at ../include/ruby/internal/core/rstring.h:516
#3  rb_reg_init_str (options=0, s=140737488346082, re=140737345038080) at ../re.c:3299
#4  rb_reg_new_str (options=0, s=140737488346082) at ../re.c:3291
#5  rb_reg_regcomp (str=140737488346082) at ../re.c:3373
#6  0x0000555555584648 in main () at ../include/ruby/internal/encoding/encoding.h:418

I tried changing the type of the string that I pass to the function, but nothing worked. The expected behaviour is that the program runs successfully.

Can someone help? Thanks in advance!

  • "Compiles just fine" and "with the warning" don't really go together well. Commented Mar 16, 2023 at 12:01
  • @Gerhardh yeah I agree. I am gonna edit that. Commented Mar 16, 2023 at 12:02

1 Answer

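After a bit of digging I figured out that you need to convert the C string to a Ruby String and pass that to the function. VALUE is not a char*: as the compiler note shows, it is an integer type (long unsigned int here) that refers to a Ruby object, so the address of the char array was being interpreted as if it were a Ruby object. I was confused because the documentation says: "Ruby’s String kinda corresponds to C’s char*."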

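#include <ruby.h>
#include "ruby/re.h"

int main(int argc, char** argv) {
    char string[] = "regex";
    VALUE x;

    /* The VM must be initialized before any Ruby object is created. */
    if (ruby_setup() != 0) {
        return 1;
    }
    /* Wrap the C string in a Ruby String object (a VALUE). */
    x = rb_str_new_cstr(string);
    rb_reg_regcomp(x);
    /* Shut the VM down again; returns the process exit status. */
    return ruby_cleanup(0);
}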